
This post suggests that the p-value follows a uniform distribution under a point null hypothesis with continuous data.

In my project, I have millions of p-values from actual data. I don't want to get into the details of my data and statistical test; I just want to know the following:

In general, for a point null hypothesis and continuous data, if the p-values do not follow a uniform distribution, does that imply that 1) the null hypothesis is incorrect, and/or 2) the statistical test applied is not appropriate? Could there be other reasons?
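For concreteness, here is a small simulation sketch of the two situations. The test (a two-sided one-sample t-test on normal data) and the effect size are purely illustrative assumptions, not taken from my actual data: under a true point null the p-values are approximately uniform, and under an alternative they pile up near zero.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

def pvals(mu, n_sims=10_000, n=30):
    """p-values from a two-sided one-sample t-test of H0: mean = 0."""
    data = rng.normal(loc=mu, size=(n_sims, n))
    return stats.ttest_1samp(data, popmean=0, axis=1).pvalue

p_null = pvals(mu=0.0)   # null true: p-values approximately Uniform(0, 1)
p_alt  = pvals(mu=0.5)   # null false: p-values concentrate near 0

print("fraction < 0.05 under null:", (p_null < 0.05).mean())  # close to 0.05
print("fraction < 0.05 under alt :", (p_alt < 0.05).mean())   # much larger
```

A histogram of `p_null` would look flat, while one of `p_alt` would be heavily right-skewed with most mass near zero.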

Thanks,

blueskyddd
    I don't understand your question. Under these conditions the p-value under the null hypothesis is uniformly distributed *by definition*, as explained in the comments on & answers to the question you linked to. – Scortchi - Reinstate Monica Apr 14 '16 at 15:22
  • Thanks for the comment! My question is: if the actual p-values do not follow a uniform distribution, does that mean one used the wrong test? – blueskyddd Apr 15 '16 at 16:03
  • No, it could mean that the null hypothesis is wrong... but to compare the p-value distribution with the uniform distribution, you need to perform the same study dozens of times, which is normally impractical. – Michael M Apr 15 '16 at 16:13
  • As @MichaelM says, the distribution under an alternative hypothesis won't necessarily be uniform - indeed shouldn't be if it's an alternative you want your test to have power against. I'm also puzzled as to how you're getting the distribution of actual p-values. Perhaps add an illustration of what you mean to your question. – Scortchi - Reinstate Monica Apr 15 '16 at 16:24
  • I want to raise pretty much the same points as @Scortchi. How do you know "if the p-value of a statistic test does not follow uniform distribution"? You should make it clear whether you are talking about running a simulation to generate simulated p-values, or if you intend to use p-values from actual studies. – Silverfish Apr 15 '16 at 18:06
  • Are you obtaining different p values for the same hypothesis or just the same *type* of hypothesis (but e.g. from different variables)? – Michael M Apr 18 '16 at 14:28
  • I think it's strictly speaking the distribution of the test statistic that needs to be continuous (consider e.g. testing a one-sided hypothesis about a normal mean) - having thought about it a bit since my earlier comment. Anyway, see [Fisher's method](https://en.wikipedia.org/wiki/Fisher%27s_method) for meta-analysis. If I were you I *would* go into a bit more detail about exactly what I was doing. Does "point null" mean the distribution of the test statistic is fully specified under the null? – Scortchi - Reinstate Monica Apr 18 '16 at 14:30
  • If you really have millions of p-values, then look into ideas such as *empirical nulls*, FDR, ... https://stats.stackexchange.com/questions/123402/how-do-fdr-procedures-estimate-a-false-discovery-rate-without-a-model-of-base-ra/178375#178375 – kjetil b halvorsen Mar 09 '20 at 15:14

2 Answers


P-values may be skewed for reasons that do not imply anything about whether the null hypothesis is correct. Researchers generally only run a study when they have good reason to believe their hypothesis is valid. In that situation, we would expect p-values to be skewed toward zero, because bad hypotheses have already been filtered out by other means (such as common sense or domain knowledge). In this case the lack of uniformity is due to researcher skill.

On the other hand, there could be a bias in published results: researchers are more likely to publish significant p-values and leave the non-significant ones in a filing cabinet. That would skew the distribution of published p-values toward zero and tend to cast doubt on the results. Take this example: if a hundred scientists ran the same study of a true null hypothesis, only about 5% would get significant p-values by chance; if only those 5% published, the effect would look well established, but would in fact just represent random variation.
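The hundred-scientists example can be sketched with a short simulation. The numbers and the one-sample t-test here are illustrative stand-ins for whatever study each team actually runs:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# 100 teams each test a true null hypothesis (the mean really is 0).
n_teams, n_obs = 100, 50
data = rng.normal(loc=0.0, size=(n_teams, n_obs))
p = stats.ttest_1samp(data, popmean=0, axis=1).pvalue

# Only "significant" results get published; the rest go in the file drawer.
published = p[p < 0.05]
print(f"{published.size} of {n_teams} teams publish, all with p < 0.05")
```

Around 5 teams publish on average, every one of them reporting a "significant" effect that is pure noise.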

Overall, I am skeptical that any conclusion can be drawn from the distribution of p-values alone.

Chris P

The distribution of the p-value under the null hypothesis can tell you whether the test achieves its nominal size. If, say, more than 5% of that distribution falls below the 0.05 mark, the test is said to be anticonservative (it rejects too often); if less than 5% does, it is conservative. Fisher's Exact Test is a commonly accepted conservative test. So the p-value having a uniform distribution under the null isn't a strict requirement. Nor does it make a test "suitable" on its own - the test could have no power, or less power than another test of the same size.
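As an illustration (the sample sizes and success probability below are assumptions for the sketch, not part of the answer), one can estimate the actual size of Fisher's Exact Test by simulating 2x2 tables under the null of equal proportions in the two groups:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)

# Simulate 2x2 tables under the null: both groups share the same success rate.
n_sims, n_per_group, p_true = 2000, 20, 0.3
reject = 0
for _ in range(n_sims):
    a = rng.binomial(n_per_group, p_true)
    b = rng.binomial(n_per_group, p_true)
    table = [[a, n_per_group - a], [b, n_per_group - b]]
    _, pval = stats.fisher_exact(table)
    if pval < 0.05:
        reject += 1

print("actual size:", reject / n_sims)  # typically well below the nominal 0.05
```

The estimated rejection rate comes out below 0.05, which is exactly the conservative behaviour described above: the null p-values are not uniform, yet the test is perfectly valid.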

AdamO