25

I am using a rank-sum test to compare the medians of two samples ($n=120000$) and have found that they are significantly different, with $p = 1.12 \times 10^{-207}$. Should I be suspicious of such a small $p$-value, or should I attribute it to the high statistical power that comes with a very large sample? Is there any such thing as a suspiciously low $p$-value?

amoeba
N26

3 Answers

32

P-values on standard computers (using IEEE double precision floats) can get as low as approximately $10^{-303}$. These can be legitimately correct calculations when effect sizes are large and/or standard errors are low. Your value, if computed with a T or normal distribution, corresponds to an effect size of about 31 standard errors. Remembering that standard errors usually scale with the reciprocal square root of $n$, that reflects a difference of less than 0.09 standard deviations (assuming all samples are independent). In most applications, there would be nothing suspicious or unusual about such a difference.
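The arithmetic in this paragraph can be checked with a short sketch. It assumes a two-sided p-value from a standard normal reference distribution and uses the rule of thumb above that the standard error is about one standard deviation divided by $\sqrt{n}$. The helper names `two_sided_p` and `z_from_two_sided_p` are illustrative, not taken from any library.

```python
import math

def two_sided_p(z):
    """Two-sided standard normal tail probability P(|Z| >= z)."""
    return math.erfc(z / math.sqrt(2.0))

def z_from_two_sided_p(p, lo=0.0, hi=38.0):
    """Invert two_sided_p by bisection; the function is decreasing in z."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if two_sided_p(mid) > p:
            lo = mid   # tail still too heavy: z must be larger
        else:
            hi = mid
    return 0.5 * (lo + hi)

p = 1.12e-207
n = 120_000

z = z_from_two_sided_p(p)          # effect size in standard errors
effect_in_sd = z / math.sqrt(n)    # assumes SE = SD / sqrt(n)

print(f"z = {z:.2f} standard errors")
print(f"difference = {effect_in_sd:.4f} standard deviations")
```

Bisection on `math.erfc` is used here only to stay within the standard library; a dedicated quantile function would do the same job.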

Interpreting such p-values is another matter. Viewing a number as small as $10^{-207}$ or even $10^{-10}$ as a probability is exceeding the bounds of reason, given all the ways in which reality is likely to deviate from the probability model that underpins this p-value calculation. A good choice is to report the p-value as being less than the smallest threshold you feel the model can reasonably support: often between $0.01$ and $0.0001$.
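As a minimal sketch of that reporting rule, the hypothetical helper below censors a p-value at a chosen floor. The name `report_p` and the default floor of $0.001$ are assumptions made for illustration; the floor should be whatever threshold you believe the model can support.

```python
def report_p(p, floor=0.001):
    """Format p for reporting, censoring values below `floor`."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    if p < floor:
        return f"p < {floor:g}"
    return f"p = {p:.3g}"

print(report_p(1.12e-207))
print(report_p(0.0312))
```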

whuber
  • 14
    When I reported "$p<10^{-26}$" in a conference paper, a reviewer told me that I should change it to "$p<0.001$" in order to follow APA guidelines. – Thomas Levine Jun 10 '11 at 23:15
  • 4
    @whuber - Beautifully stated. – rolando2 Jun 11 '11 at 04:24
  • 2
    (+1) At some point it's more likely that the government is nefariously flipping bits in your RAM remotely with super spy technology... – JMS Jun 11 '11 at 04:56
  • 4
    (+1) You can actually get down to just below $5 \times 10^{-324}$ in IEEE double precision floating point. But, your numerical routines for calculating $p$-values are almost guaranteed to fall apart before then. Unless you know for a fact that your modeling assumptions are perfectly correct (and when are they?), a $p$-value eventually just becomes a measure of the sample size once the sample gets large enough. – cardinal Jun 11 '11 at 15:18
  • 1
    @Cardinal we're both wrong about the limits: apart from denormalized values, the [smallest IEEE double](http://en.wikipedia.org/wiki/IEEE_754-2008) is approximately $10^{-308}$, corresponding to ten bits for a base-2 exponent. – whuber Jun 11 '11 at 16:41
  • 1
    @whuber: I was listing the smallest positive representable number, including subnormal values: $2^{-1074}$. :) – cardinal Jun 11 '11 at 16:58
  • 1
    @ThomasLevine: isn't that just losing information? Not much information, to be sure, but it's also not saving any space, or making it easier to read. Sounds like a pointless convention... – naught101 Apr 20 '12 at 01:46
  • 1
    I agree with @naught: especially for those trying to reproduce or check a paper, having a reasonably precise value for comparison is helpful. Changing "$\lt 10^{-26}$" to "$\lt 0.001$" erases 23 significant digits! Evidently, the APA guidelines focus on reporting results rather than on checking them -- to the detriment of all science. – whuber Aug 03 '21 at 15:06
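The floating-point limits discussed in the comments above can be inspected directly. This sketch assumes a platform where Python's `float` is an IEEE 754 double, which is the usual case.

```python
import math
import sys

# Smallest positive normalized double, 2**-1022 (about 2.2e-308).
smallest_normal = sys.float_info.min

# Smallest positive subnormal double, 2**-1074 (about 4.9e-324).
smallest_subnormal = math.ldexp(1.0, -1074)

print(smallest_normal)
print(smallest_subnormal)

# Halving the smallest subnormal underflows to zero.
print(smallest_subnormal / 2)
```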
17

There is nothing suspicious -- extremely low p-values like yours are pretty common when sample sizes are large (as yours is for comparing medians). As whuber mentioned, normally such p-values are reported as being less than some threshold (e.g. <0.001).

One thing to be careful about is that a p-value only tells you whether the difference in medians is statistically significant. Whether the difference is large enough in magnitude to matter is something you will have to decide: for large samples, extremely small differences in means or medians can be statistically significant without meaning very much in practice.
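To illustrate how sample size alone drives the p-value, here is a rough sketch that holds a small standardized difference fixed and varies $n$. It uses a normal approximation for a difference in means between two independent groups of equal size, not the rank-sum test itself. The function `two_sample_p` and the difference of 0.02 standard deviations are assumptions chosen for the example.

```python
import math

def two_sample_p(d, n):
    """Two-sided p-value for a standardized mean difference d between two
    independent groups of n observations each (normal approximation)."""
    z = abs(d) * math.sqrt(n / 2.0)   # SE of the difference is sqrt(2/n) SDs
    return math.erfc(z / math.sqrt(2.0))

d = 0.02  # hypothetical difference, in standard deviations

p_small_n = two_sample_p(d, 100)
p_large_n = two_sample_p(d, 120_000)

print(f"n = 100:     p = {p_small_n:.3g}")
print(f"n = 120000:  p = {p_large_n:.3g}")
```

The same tiny difference is nowhere near significant with 100 observations per group and highly significant with 120,000, even though its practical importance has not changed.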

xuexue
3

A p-value can achieve a value of 0.

Suppose I am testing a hypothesis about the upper limit of a Uniform$(0, \theta)$ random variable. If I set $\mathcal{H}_0: \theta = 1$ and observe $X=1.1$, then it is impossible to observe such a value or higher under the null hypothesis, so the p-value is 0.
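A minimal sketch of this example, assuming a single observation and an upper-tail p-value $P(X \ge x)$ under the null. The function name `uniform_upper_p` is hypothetical.

```python
def uniform_upper_p(x, theta0=1.0):
    """P(X >= x) for X ~ Uniform(0, theta0), clipped to [0, 1]."""
    if theta0 <= 0:
        raise ValueError("theta0 must be positive")
    return min(1.0, max(0.0, 1.0 - x / theta0))

print(uniform_upper_p(1.1))    # outside the support under the null
print(uniform_upper_p(0.95))   # inside the support
```

An observation beyond the null's support gives a p-value of exactly zero, not merely a very small one.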

AdamO