
My goal is to create a perceptually balanced distance metric to compute the similarity between two scatter plots (inspired by the work on Graph-Theoretic Scagnostics).

My plan is to run a user study in which human subjects are shown triplets of plots and must choose which of the two outer plots, the one on the left or the one on the right, is more similar to the one in the center.

My reasoning is that I can then use this data to create a distance function that can predict the (perceptual) similarity of new pairs of plots.

From the little research I have done, it looks like metric learning is what I need, but I am not familiar with it. What would be the best way to go about this?

A naive solution I have in mind is to use PCA to project the training data and then, for two new plots, find their nearest neighbor(s) in the projected space and compute the (average) distance between them. Is there anything wrong with this idea?
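For triplet data of exactly this kind ("the left plot is closer to the center plot than the right plot is"), one standard metric-learning formulation is to learn a linear transform `L` so that squared distances `||L(x - y)||^2` respect the human judgments, minimizing a hinge loss over the collected triplets. Below is a minimal gradient-descent sketch, assuming each scatter plot has already been summarized as a fixed-length feature vector (e.g., its scagnostics measures); the function names, margin, and learning rate are all illustrative choices, not part of the original question:

```python
import numpy as np

def learn_metric(X, triplets, margin=1.0, lr=0.005, epochs=100):
    """Learn a linear map L so that d(x, y) = ||L(x - y)||^2 agrees
    with triplet judgments. Each triplet (c, w, l) records that plot
    X[w] was judged closer to the center plot X[c] than X[l] was."""
    n, d = X.shape
    L = np.eye(d)  # start from plain Euclidean distance
    for _ in range(epochs):
        grad = np.zeros_like(L)
        for c, w, l in triplets:
            dw = X[c] - X[w]  # center -> chosen plot
            dl = X[c] - X[l]  # center -> rejected plot
            dist_w = np.sum((L @ dw) ** 2)
            dist_l = np.sum((L @ dl) ** 2)
            # hinge loss: penalize triplets where the chosen plot
            # is not closer than the rejected one by `margin`
            if dist_w + margin > dist_l:
                grad += 2 * L @ (np.outer(dw, dw) - np.outer(dl, dl))
        L -= lr * grad / len(triplets)
    return L

def metric_distance(L, x, y):
    diff = L @ (x - y)
    return float(diff @ diff)
```

Once `L` is learned, `metric_distance` can score the similarity of any new pair of plots directly, with no nearest-neighbor lookup needed.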

ebertini
  • It's an interesting and well-formulated question (+1). I cannot see, though, how PCA could be applied to a set of "closest" determinations nor, if it did, how it could be induced to generate a distance function. – whuber Jul 07 '15 at 15:08
  • My reasoning is that the pairwise choices can be transformed into a distance matrix by counting the number of times each pair is chosen, and that this matrix can then be used to compute principal components (apparently this is possible with a Gram matrix: [Performing PCA with only a distance matrix](http://stats.stackexchange.com/questions/87681/performing-pca-with-only-a-distance-matrix)). – ebertini Jul 07 '15 at 16:04
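The Gram-matrix trick mentioned in the comment above is classical MDS: double-center the squared distance matrix to recover a Gram matrix, then eigendecompose it, which is equivalent to PCA when the distances are Euclidean. A minimal NumPy sketch, with the function name and the choice of `k` purely illustrative:

```python
import numpy as np

def classical_mds(D, k=2):
    """Embed n items into k dimensions from an n x n pairwise
    distance matrix D via double centering (classical MDS)."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n   # centering matrix
    B = -0.5 * J @ (D ** 2) @ J           # Gram matrix
    vals, vecs = np.linalg.eigh(B)
    idx = np.argsort(vals)[::-1][:k]      # largest eigenvalues first
    pos = np.clip(vals[idx], 0, None)     # guard against tiny negatives
    return vecs[:, idx] * np.sqrt(pos)
```

Note that a distance matrix built from choice counts is not guaranteed to be Euclidean (the Gram matrix may have negative eigenvalues), which is why the sketch clips them; that information loss is one thing to watch for with this approach.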

0 Answers