I am using a ranksum test to compare the medians of two samples ($n=120000$) and have found that they are significantly different, with $p = 1.12 \times 10^{-207}$. Should I be suspicious of such a small $p$-value, or should I attribute it to the high statistical power associated with having a very large sample? Is there any such thing as a suspiciously low $p$-value?
- This is almost a duplicate of https://stats.stackexchange.com/questions/78839. – amoeba May 09 '18 at 19:53
3 Answers
P-values on standard computers (using IEEE double precision floats) can get as low as approximately $10^{-303}$. These can be legitimately correct calculations when effect sizes are large and/or standard errors are low. Your value, if computed with a T or normal distribution, corresponds to an effect size of about 31 standard errors. Remembering that standard errors usually scale with the reciprocal square root of $n$, that reflects a difference of less than 0.09 standard deviations (assuming all samples are independent). In most applications, there would be nothing suspicious or unusual about such a difference.
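The arithmetic in the paragraph above can be reproduced with a minimal sketch. It assumes the reported value is a two-sided p-value from a normal approximation and that the standard error scales as $1/\sqrt{n}$ with $n = 120000$; the function names are illustrative only.

```python
import math

def two_sided_p(z):
    """Two-sided normal tail probability for a z statistic."""
    return math.erfc(abs(z) / math.sqrt(2.0))

def z_from_two_sided_p(p, lo=0.0, hi=37.0, iters=100):
    """Invert two_sided_p by bisection (the tail probability decreases in z)."""
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if two_sided_p(mid) > p:
            lo = mid  # tail still too heavy, so z must be larger
        else:
            hi = mid
    return (lo + hi) / 2.0

p = 1.12e-207   # p-value reported in the question
n = 120_000     # sample size reported in the question

z = z_from_two_sided_p(p)        # effect size in standard errors
effect_sd = z / math.sqrt(n)     # same effect in standard deviations

print(f"z = {z:.2f} standard errors")
print(f"effect = {effect_sd:.4f} standard deviations")
```

Bisection on `math.erfc` is used here only to stay within the standard library; a dedicated quantile function would do the same job.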
Interpreting such p-values is another matter. Viewing a number as small as $10^{-207}$ or even $10^{-10}$ as a probability is exceeding the bounds of reason, given all the ways in which reality is likely to deviate from the probability model that underpins this p-value calculation. A good choice is to report the p-value as being less than the smallest threshold you feel the model can reasonably support: often between $0.01$ and $0.0001$.
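One way to apply this reporting rule is a small formatting helper. This is only a sketch: the name `report_p` and the default floor of $0.0001$ are my own choices, and the right floor depends on how far you trust the model.

```python
def report_p(p, floor=1e-4):
    """Format a p-value, never claiming more than the chosen floor supports."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    if p < floor:
        # Below the floor, report only the bound.
        return f"p < {floor:g}"
    return f"p = {p:.2g}"

print(report_p(1.12e-207))
print(report_p(0.03))
```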

- When I reported ''$p<10^{-26}$'' in a conference paper, a reviewer told me that I should change it to ''$p<0.001$'' in order to follow APA guidelines. – Thomas Levine Jun 10 '11 at 23:15
- (+1) At some point it's more likely that the government is nefariously flipping bits in your RAM remotely with super spy technology... – JMS Jun 11 '11 at 04:56
- (+1) You can actually get down to just below $5 \times 10^{-324}$ in IEEE double precision floating point. But your numerical routines for calculating $p$-values are almost guaranteed to fall apart before then. Unless you know for a fact that your modeling assumptions are perfectly correct (and when are they?), a $p$-value eventually just becomes a measure of the sample size once the sample gets large enough. – cardinal Jun 11 '11 at 15:18
- @Cardinal we're both wrong about the limits: apart from denormalized values, the [smallest IEEE double](http://en.wikipedia.org/wiki/IEEE_754-2008) is approximately $10^{-308}$, corresponding to ten bits for a base-2 exponent. – whuber Jun 11 '11 at 16:41
- @whuber: I was listing the smallest positive representable number, including subnormal values: $2^{-1074}$. :) – cardinal Jun 11 '11 at 16:58
- @ThomasLevine: isn't that just losing information? Not much information, to be sure, but it's also not saving any space, or making it easier to read. Sounds like a pointless convention... – naught101 Apr 20 '12 at 01:46
- I agree with @naught: especially for those trying to reproduce or check a paper, having a reasonably precise value for comparison is helpful. Changing "$\lt 10^{-26}$" to "$\lt 0.001$" erases 23 significant digits! Evidently, the APA guidelines focus on reporting results rather than on checking them -- to the detriment of all science. – whuber Aug 03 '21 at 15:06
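The floating-point limits debated in the comments above can be checked directly. This is a small sketch assuming a platform with IEEE 754 double precision, which is what CPython uses on common hardware; the variable names are mine.

```python
import math
import sys

# Smallest positive *normal* double: 2**-1022, roughly 2.2e-308.
smallest_normal = sys.float_info.min

# Smallest positive *subnormal* double: 2**-1074, roughly 4.9e-324.
smallest_subnormal = math.ldexp(1.0, -1074)

print(smallest_normal)
print(smallest_subnormal)

# Halving the smallest subnormal rounds to zero: nothing smaller is representable.
print(smallest_subnormal / 2.0)
```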
There is nothing suspicious -- extremely low p-values like yours are pretty common when sample sizes are large (as yours is for comparing medians). As whuber mentioned, normally such p-values are reported as being less than some threshold (e.g. <0.001).
One thing to be careful about is that a p-value only tells you whether the difference in medians is statistically significant. Whether the difference is large enough in magnitude to matter is something you will have to decide: with large samples, extremely small differences in means or medians can be statistically significant without meaning very much.
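To illustrate the point about large samples, here is a hedged simulation sketch. It draws two normal samples of 120000 values each that differ by a hypothetical shift of 0.09 standard deviations, and applies a hand-written normal approximation to the rank-sum test (no tie correction, which is reasonable only for continuous data). The function names and the simulated data are assumptions for illustration, not the asker's data or software.

```python
import math
import random

def ranksum_z(x, y):
    """z statistic for the Wilcoxon rank-sum test (normal approximation, no ties)."""
    n1, n2 = len(x), len(y)
    # Pool both samples, tagging each value with its group (0 for x, 1 for y).
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    # Sum of the 1-based ranks belonging to the first sample.
    w = sum(rank for rank, (_, group) in enumerate(pooled, start=1) if group == 0)
    mean_w = n1 * (n1 + n2 + 1) / 2.0
    var_w = n1 * n2 * (n1 + n2 + 1) / 12.0
    return (w - mean_w) / math.sqrt(var_w)

def two_sided_p(z):
    """Two-sided normal tail probability for a z statistic."""
    return math.erfc(abs(z) / math.sqrt(2.0))

random.seed(0)
n = 120_000
shift = 0.09  # hypothetical shift, in standard deviations
x = [random.gauss(0.0, 1.0) for _ in range(n)]
y = [random.gauss(shift, 1.0) for _ in range(n)]

z = ranksum_z(x, y)
p = two_sided_p(z)
print(f"z = {z:.1f}, two-sided p = {p:.3g}")
```

The shift is small relative to the spread of the data, yet the p-value is minuscule; whether such a shift matters is a subject-matter question the test cannot answer.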

A p-value can achieve a value of 0.
Suppose I am testing a hypothesis about the range of a Uniform$(0, \theta)$ random variable. If I set $\mathcal{H}_0: \theta = 1$ and observe $X = 1.1$, it is impossible to observe such a value or a higher one under the null hypothesis, so the p-value is 0.
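The example can be written out as a short sketch. It assumes a single observation and an upper-tail p-value, $P(X \ge x \mid \theta = \theta_0)$; the function name is hypothetical.

```python
def uniform_upper_tail_p(x, theta0=1.0):
    """P(X >= x) for X ~ Uniform(0, theta0), clipped to [0, 1]."""
    return min(1.0, max(0.0, 1.0 - x / theta0))

print(uniform_upper_tail_p(1.1))  # observation outside the null support
print(uniform_upper_tail_p(0.3))  # observation inside the null support
```

Here the zero is exact rather than a rounding artifact: the observed value lies outside the support of the null distribution.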
