
In textbook statistical tests, we usually calculate the probability of observing the data we observed given that the null hypothesis is true, i.e. $P[D|H_0]$. If this probability is small (e.g. $<0.05$), we claim that the null hypothesis is unlikely given the data, i.e. we reject the null hypothesis. That is, we claim that because $P[D|H_0]$ is small, $P[H_0|D]$ is also small. But that is not generally true.
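
Writing this out with Bayes' rule (and assuming $H_0$ and $H_1$ are exhaustive) makes the gap explicit: the posterior depends on quantities that the likelihood alone does not determine, namely the prior $P[H_0]$ and the likelihood of the data under the alternative:

$$P[H_0|D] = \frac{P[D|H_0]\,P[H_0]}{P[D|H_0]\,P[H_0] + P[D|H_1]\,P[H_1]}$$

A small $P[D|H_0]$ can coexist with a large $P[H_0|D]$ whenever $P[D|H_1]$ is even smaller.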

This is similar to the following line of reasoning, which leads to an incorrect conclusion: If a person is an American, he is probably not a member of Congress. This person is a member of Congress. Therefore, he is probably not an American. (Pollard & Richardson, 1987)
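
To make this concrete, here is a quick back-of-the-envelope sketch in Python. The figures (roughly 330 million Americans and 535 members of Congress, every one of whom is an American) are rough assumptions used only for illustration:

```python
# Rough illustrative figures, not exact counts.
n_americans = 330_000_000   # approximate US population
n_congress = 535            # House + Senate, all of whom are American

# P[D|H0]: probability that a randomly chosen American is in Congress.
p_d_given_h0 = n_congress / n_americans
print(f"P[Congress | American] = {p_d_given_h0:.2e}")  # ~1.62e-06, tiny

# P[H0|D]: probability that a member of Congress is an American.
# Every member of Congress is American, so the posterior is 1, not small.
p_h0_given_d = 1.0
print(f"P[American | Congress] = {p_h0_given_d}")
```

The likelihood $P[D|H_0]$ is minuscule, yet the posterior $P[H_0|D]$ is as large as it can possibly be, because the observation is even less likely (impossible, in fact) under the alternative.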

And yet, rejection of the null hypothesis based on small p-values appears to be both widely taught and widely used. Why? What are the assumptions on the Bayesian priors that allow us to reject the null hypothesis based on the low likelihood of the data under the null hypothesis?
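
For contrast, here is a minimal numerical sketch of the situation a well-designed test seems to presuppose, namely that the observed data are far more likely under $H_1$ than under $H_0$. The likelihoods and the 50/50 prior are made-up values, assumed purely for illustration:

```python
# Hypothetical likelihoods and prior, chosen only for illustration.
p_d_given_h0 = 0.01   # data improbable under the null
p_d_given_h1 = 0.50   # data plausible under the alternative
prior_h0 = 0.5        # assumed 50/50 prior over the two hypotheses

# Bayes' rule (H0 and H1 assumed exhaustive).
posterior_h0 = (p_d_given_h0 * prior_h0) / (
    p_d_given_h0 * prior_h0 + p_d_given_h1 * (1 - prior_h0)
)
print(f"P[H0 | D] = {posterior_h0:.3f}")  # ~0.020: here a small likelihood does imply a small posterior
```

In the Congress example the inequality runs the other way ($P[D|H_1] = 0 < P[D|H_0]$), so the small likelihood under $H_0$ carries no evidence against it.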

  • Usually, in a hypothesis test the situation is designed so that $H_0$ and $H_1$ are contradictory and $H_1$ leads to the obtained data being (relatively) likely. The Americans/Congress example fails this: $H_1$ here is “the person is not an American”, which actually reduces the probability of the observed outcome (member of Congress). Has it perhaps been shown that, under properly set up null and alternative hypotheses, the likelihood is close enough to the posterior under some sensible assumptions? – rinspy Sep 01 '17 at 09:29
  • Just a tip to keep your question straightforward: the comment you placed on your own question is IMO better edited into the question itself. Furthermore, as you seem to disprove your own argument in this comment, I'm kind of at a loss as to what it is you are asking. Please clarify – IWS Sep 01 '17 at 09:32
  • I think the premise of the question is wrong (i.e. your characterization of how the argument goes is neither a suitable nor a typical argument for hypothesis testing). I don't think either Fisher or Neyman and Pearson make the claim that if $P(D|H_0)$ is small, so is $P(H_0|D)$; indeed that reads like an argument a Bayesian would use to suggest there was a problem (which, if it's not how the proponents make the argument, would be something of a straw man). Where does this argument come from? – Glen_b Sep 01 '17 at 09:35
  • @Glen_b It comes from the way statistical testing tends to be used in practice. For example, from Wikipedia on "Null hypothesis": "In the significance testing approach of Ronald Fisher, a null hypothesis is rejected if the observed data are significantly unlikely to have occurred if the null hypothesis were true." Is this not the same as saying that because $P[D|H_0]$ is small, we assume that $P[H_0|D]$ is also small and hence reject the null hypothesis? If we make no claim about $P[H_0|D]$, how can we reject the null hypothesis just on the basis of $P[D|H_0]$ being low? – rinspy Sep 01 '17 at 09:43
  • 1. It is not quite the same, no. That last question in your comment is a good one. I'd post a question about *that*. 2. You have to be careful ... Wikipedia is written by anyone at any time and periodically contains text that an accomplished statistician would not agree with; this is particularly the case with the "basic stats practice" articles like that one. Next week it will probably say something quite different. In particular, in this case I'd argue that the phrase "significantly unlikely" is pretty much nonsense (though the broad gist of the part you quoted is more or less okay) – Glen_b Sep 01 '17 at 09:55
  • @IWS That comment does not disprove the argument - there is just some intuition. We still need to show that $P[H_0|D]$ is low to reject the null hypothesis, right? But how can we do it if all we know is $P[D|H_0]$? OK, we can assume that $P[D|H_0] \ll P[D|H_1]$, but I don't think in most cases we can assume anything about $P[H_0]$ or $P[H_1]$. – rinspy Sep 01 '17 at 09:55
  • @Glen_b I will have to find the exact quote, but I am pretty sure my CFA textbook said something similar when it comes to statistical testing and rejecting $H_0$ on the basis of $P[D|H_0]$ being small. – rinspy Sep 01 '17 at 09:59
  • What's "CFA"? ... Contrast the argument you make with, say, the discussion in the first few pages (Part I) of [Neyman & Pearson](http://www.stats.org.uk/statistical-inference/NeymanPearson1933.pdf), for example – Glen_b Sep 01 '17 at 10:04
  • @Glen_b https://en.wikipedia.org/wiki/Chartered_Financial_Analyst. Thank you, I will have a look. – rinspy Sep 01 '17 at 10:07
  • Thanks; I asked because you might have been talking about a text on *Confirmatory Factor Analysis* (or indeed several other possibilities). Note that texts in many application areas often give explanations and justifications that don't stand up to scrutiny. (Unfortunately, you see some in stats-books-for-statisticians once in a while too, it's not like we're immune to a bad argument) – Glen_b Sep 01 '17 at 10:35
  • see https://stats.stackexchange.com/questions/163957/what-follows-if-we-fail-to-reject-the-null-hypothesis/164094#164094 –  Sep 01 '17 at 11:54
  • @fcop Let $H_0$ be "A is an American". Let $D$ be "A is a member of Congress". Assuming $H_0$ is true, $D$ is very improbable. Then $H_0$ cannot be true because it leads to a 'statistical contradiction'. Therefore $H_1$ must be true, i.e. A is not an American. The probability of making a type I error is very low. – rinspy Sep 01 '17 at 13:08
  • @rinspy: I am not sure that I can follow your notation: usually you have for $H_0$ something like: the average weight $\mu$ of all Americans is 75 kg; then you draw a sample and compute the average weight of the (Americans) in your sample, $\bar{x}$. In that case I would say that $H_0: \mu=75$ ($H_1: \mu \ne 75$) and your $D$ is then the sample statistic, i.e. $\bar{x}$? Can you be more concrete about your $H_0$ and your $D$ in a particular case? –  Sep 01 '17 at 14:23
  • (1) The reference to a "Bayesian prior" is irrelevant to hypothesis testing. (2) The other misconceptions reflected in the formulation of this question are addressed at https://stats.stackexchange.com/questions/31. – whuber Sep 01 '17 at 14:41
  • @whuber How do we go from $P[D|H_0]$ to $P[H_0|D]$, which is what we are really interested in and what we need to reject the null hypothesis, without Bayes' rule and hence a "Bayesian prior"? – rinspy Sep 01 '17 at 15:18
  • @fcop $H_0$ is a hypothesis, $D$ is an observation. That is what my question is - are there some assumptions built into the problem you described (where the hypothesis is about population mean and the observation is a sample) that allow us to reject the null hypothesis using $P[D|H_0]$ instead of $P[H_0|D]$? Why do these assumptions not hold in the Americans in Congress example? – rinspy Sep 01 '17 at 15:43
  • "$P[H_0\mid D]$" has no meaning in the standard (non-Bayesian) theory of hypothesis testing. This is extensively explained in the thread I linked to. – whuber Sep 01 '17 at 15:48
  • @whuber I read the thread and I understand the meaning of p-values and how they are used. But I do not understand how $P[H_0|D]$ has no meaning in this theory of hypothesis testing. The meaning is clear - probability of $H_0$ holding given that we observed $D$. And it does intuitively appear that a small likelihood should imply a small posterior in most "textbook" applications of standard hypothesis testing... I just wonder if there is some formal argument to be made here? I haven't read the Neyman & Pearson 1933 paper yet, perhaps it addresses this. – rinspy Sep 01 '17 at 16:11
  • The meaning is clear *if you assume $H_0$ is random.* That assumption is not part of the theory. Without that assumption, the expression "$P[H_0\mid D]$" is *meaningless*. Thus, no formal argument can possibly be made about it. – whuber Sep 01 '17 at 16:23
  • @rinspy: I asked what your concrete test would be. Of course you can find all kinds of inconsistent things like $H_0$ is 'all Americans speak American' and $D$ is 'all fish can swim' ... so my example was just an example; I can find other examples that are not about the mean, but can you give a concrete example of what you want to test and what data (sample) you use for that? It seems obvious to me that there is a link between the test you want to do and the data you use for that? –  Sep 01 '17 at 16:24
  • @whuber I see. So it boils down to Bayesian vs frequentist interpretations of probability. Somehow it still feels very unsatisfying. So under the Bayesian interpretation, using the likelihood of data to reject the null hypothesis would be incorrect, right? A Bayesian *would* only be able to make a decision to reject or accept the null after calculating $P[H_0|D]$? – rinspy Sep 01 '17 at 16:49
  • 1
    At present, this question is not very clear. It is either premised on incorrect assumptions or just another entry into the never ending debate about which theory of statistics should be used. Several suggestions have been made for ways to clarify this question & set it on a useful path. Please pick one. In the interim, I am closing this as unclear. – gung - Reinstate Monica Sep 01 '17 at 17:00

0 Answers