8

Can the p-values from multiple pairwise tests be treated as a similarity/distance measure, so that multidimensional scaling can be applied to the pairwise matrix of p-values to reduce the dimensionality? This is a soft question, but what would be the biggest issue here, and how could it best be overcome? (e.g., the triangle inequality?)

qlinck
    My short answer to your first question about "similarity distance" would be no. You may refer to a similar question on a different matter here: http://stats.stackexchange.com/questions/32890/can-p-values-be-used-to-show-impact-of-treatment/32897#32897 . That post suggests other "divergence" or distance measures. I guess it's better to address your first question first. – JDav Aug 20 '12 at 11:18

4 Answers

3

If all the "true distances" are 0, then the p-values will follow a uniform distribution, so they would just be random, incorrect distances.
If the true distances are not 0, you still have scaling issues, and a test statistic may be more meaningful. P-values of 0.9 and 0.6 are not very different in interpretation, while p-values of 0.06 and 0.01 are quite different in interpretation, yet MDS algorithms would put more distance between the former pair than the latter. You should also consider power: you may have two groups with a very small distance between them but large sample sizes, so you get a small p-value, and another pair with a large difference between them that, due to small sample sizes (low power), gives a larger p-value.
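The power issue can be made concrete with a small sketch. It assumes a two-sample z-test with known standard deviation and equal group sizes, and it plugs in the true difference as if it were the observed one; the function name `two_sided_p` and all the numbers are hypothetical, chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist


def two_sided_p(mean_diff: float, sigma: float, n_per_group: int) -> float:
    """Two-sided p-value of a two-sample z-test (known sigma, equal group sizes),
    evaluated as if the observed difference in means were exactly mean_diff."""
    standard_error = sigma * sqrt(2.0 / n_per_group)
    z = abs(mean_diff) / standard_error
    return 2.0 * (1.0 - NormalDist().cdf(z))


# Pair A: groups that are very close, but measured with huge samples.
p_close_pair = two_sided_p(mean_diff=0.1, sigma=1.0, n_per_group=10_000)

# Pair B: groups that are far apart, but measured with tiny samples.
p_far_pair = two_sided_p(mean_diff=1.0, sigma=1.0, n_per_group=5)

print(f"close pair, large n: p = {p_close_pair:.3g}")
print(f"far pair, small n:   p = {p_far_pair:.3g}")
```

Under these assumptions the close pair gets the smaller p-value, so a matrix of p-values would rank the pairs in the opposite order from their true differences.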

Michael R. Chernick
Greg Snow
3

A specific case in which p-values generated from $\chi^2$ tests over frequency tables were used as similarities, with multidimensional scaling then applied, appears in this paper: http://www.biomedcentral.com/content/pdf/1748-7188-1-10.pdf

hearse
2

I believe the answer is yes.

One could think of the "similarity" between two variables as being measured by (say) correlation, and of the p-value as the significance of the correlation being different from 0. In such a case, a small p-value (nearing zero) is one that reflects a large distance ("difference") between the variables.

You could turn the p-values into z-scores (where the "distance" between them will have the "usual" direction), and see whether the methods you mentioned make sense on those...
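As a rough sketch of that conversion, assuming the p-values are two-sided and come from an (approximately) normal test statistic, the absolute z-score can be recovered with the inverse normal CDF. The helper name `p_to_abs_z` is hypothetical, and the example p-values are the ones discussed in another answer on this page.

```python
from statistics import NormalDist


def p_to_abs_z(p: float) -> float:
    """Map a two-sided p-value to the absolute z-score that would produce it."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    return NormalDist().inv_cdf(1.0 - p / 2.0)


p_values = [0.9, 0.6, 0.06, 0.01]
z_scores = [p_to_abs_z(p) for p in p_values]

# Gaps on the raw p-value scale.
gap_p_large = abs(0.9 - 0.6)    # between the two large p-values
gap_p_small = abs(0.06 - 0.01)  # between the two small p-values

# The same gaps after converting to z-scores.
gap_z_large = abs(z_scores[0] - z_scores[1])
gap_z_small = abs(z_scores[2] - z_scores[3])

for p, z in zip(p_values, z_scores):
    print(f"p = {p:<5} -> |z| = {z:.3f}")
```

On the p-value scale the 0.9 vs 0.6 gap is the larger one, while on the z scale the 0.06 vs 0.01 gap is larger. This only addresses the scaling problem; the z-scores still depend on sample size.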

Tal Galili
1

I am not sure what you mean by "p-values between multiple pair-wise tests". For a particular test, the p-value is the probability, if the null hypothesis is true, of seeing a value as extreme as or more extreme than the one actually observed. When doing pairwise testing, there is no particular connection between one p-value and another. I do not see how any p-values could be regarded as a similarity measure between pairwise tests.

Michael R. Chernick