My goal is to create a perceptually balanced distance metric to compute the similarity between two scatter plots (inspired by the work on Graph-Theoretic Scagnostics).
My plan is to run a user study in which human subjects are shown triplets of plots and have to choose which of the two outer plots, the one on the left or the one on the right, is closer to the one in the center.
My reasoning is that I can then use this data to create a distance function that can predict the (perceptual) similarity of new pairs of plots.
From the little research I have done, it looks like metric learning is what I need, but I am not familiar with it. What would be the best way to go?
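To make the question concrete, here is a minimal sketch of how I imagine the triplet answers could be turned into a distance function. It rests on assumptions that are not settled yet: each plot is already summarized by a numeric feature vector (for example a handful of scagnostics-style measures), the distance is a squared Euclidean distance with one learned weight per feature, and the weights are fitted with a simple hinge loss. The names `weighted_sq_dist` and `learn_weights` and the toy data are hypothetical, made up only for illustration.

```python
import random


def weighted_sq_dist(w, a, b):
    """Squared Euclidean distance with one non-negative weight per feature."""
    return sum(wi * (ai - bi) ** 2 for wi, ai, bi in zip(w, a, b))


def learn_weights(triplets, n_features, margin=1.0, lr=0.01, epochs=200, seed=0):
    """Fit feature weights from (center, closer, farther) triplets.

    Hinge loss: a triplet contributes a penalty unless the 'closer' plot is
    nearer to the center than the 'farther' plot by at least `margin`.
    """
    rng = random.Random(seed)
    w = [1.0] * n_features
    triplets = list(triplets)  # copy so the caller's list is not shuffled
    for _ in range(epochs):
        rng.shuffle(triplets)
        for center, closer, farther in triplets:
            d_close = weighted_sq_dist(w, center, closer)
            d_far = weighted_sq_dist(w, center, farther)
            if d_close + margin > d_far:  # triplet violates the margin
                for i in range(n_features):
                    grad = (center[i] - closer[i]) ** 2 - (center[i] - farther[i]) ** 2
                    # Gradient step, clamped so weights stay non-negative.
                    w[i] = max(0.0, w[i] - lr * grad)
    return w


# Toy data (an assumption, not study data): simulated subjects judge
# similarity using only feature 0; features 1 and 2 are irrelevant.
data_rng = random.Random(42)


def random_plot():
    return [data_rng.random() for _ in range(3)]


triplets = []
for _ in range(200):
    center, left, right = random_plot(), random_plot(), random_plot()
    if abs(center[0] - left[0]) < abs(center[0] - right[0]):
        triplets.append((center, left, right))
    else:
        triplets.append((center, right, left))

w = learn_weights(triplets, n_features=3)
accuracy = sum(
    weighted_sq_dist(w, c, a) < weighted_sq_dist(w, c, b) for c, a, b in triplets
) / len(triplets)
print("learned weights:", [round(x, 2) for x in w])
print("training triplet accuracy:", round(accuracy, 2))
```

In this toy setup the weight of the feature that drives the simulated judgments should end up dominating the other two. Whether a per-feature weighting is expressive enough for real perceptual judgments is part of what I am asking.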
A naive solution I have in mind is to use PCA to project the training data and then for two new plots find the nearest neighbor(s) and calculate the (average) distance between them. Is there anything wrong with this idea?
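For reference, this is roughly what I mean by the naive approach, written as a sketch under my own reading of it: plots are feature vectors, PCA is fitted on the training vectors, each new plot is mapped to its k nearest training neighbors in the projected space, and the distance between two new plots is the average pairwise distance between their neighbor sets. The helper names (`fit_pca`, `project`, `neighbour_distance`), the choice of two components, k = 3, and the random data are all assumptions for illustration.

```python
import numpy as np


def fit_pca(X, n_components=2):
    """Return the mean and the top principal axes of X (one row per plot)."""
    mean = X.mean(axis=0)
    # Rows of vt are the principal axes, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]


def project(X, mean, components):
    """Project rows of X onto the principal axes."""
    return (X - mean) @ components.T


def neighbour_distance(a, b, train_proj, mean, components, k=3):
    """Average distance between the k nearest training neighbours of a and of b."""

    def knn(x):
        z = project(x[None, :], mean, components)[0]
        idx = np.argsort(np.linalg.norm(train_proj - z, axis=1))[:k]
        return train_proj[idx]

    na, nb = knn(a), knn(b)
    # Mean over all pairwise distances between the two neighbour sets.
    diffs = na[:, None, :] - nb[None, :, :]
    return float(np.linalg.norm(diffs, axis=2).mean())


# Random stand-in for training feature vectors (50 plots, 5 features).
rng = np.random.default_rng(0)
X_train = rng.normal(size=(50, 5))
mean, components = fit_pca(X_train)
train_proj = project(X_train, mean, components)

plot_a, plot_b = rng.normal(size=5), rng.normal(size=5)
d_ab = neighbour_distance(plot_a, plot_b, train_proj, mean, components)
d_ba = neighbour_distance(plot_b, plot_a, train_proj, mean, components)
d_aa = neighbour_distance(plot_a, plot_a, train_proj, mean, components)
print("d(a, b):", d_ab)
print("d(a, a):", d_aa)
```

Note that in this sketch the triplet answers are never used, only the feature vectors, and with k > 1 the distance from a plot to itself is not zero.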