When performing a two-tailed test on a binomial process with p=.5, why are my p-values all over the place, and I virtually never reject the null hypothesis?
I answered a question on Stack Overflow about testing a random process (i.e., checking that a function designed to yield a binomial random variable was working correctly). I got output like this:
can reject the hypothesis that the following tests are NOT the
results of a binomial process (with their given respective
probabilities) with probability < .01, 1000000 trials each
p = 0.01 {False: 10084, True: 989916} 4.94065645841e-324 reject null
p = 0.1 {False: 100524, True: 899476} 1.48219693752e-323 reject null
p = 0.33 {False: 100633, True: 899367} 2.96439387505e-323 reject null
p = 0.5 {False: 500369, True: 499631} 0.461122365668 fail to reject
p = 0.66 {False: 900144, True: 99856} 2.96439387505e-323 reject null
p = 0.9 {False: 899988, True: 100012} 1.48219693752e-323 reject null
p = 0.99 {False: 989950, True: 10050} 4.94065645841e-324 reject null
My p-values are always all over the place for p = 0.5, but fairly consistent (almost too consistent) for the others. I'm sure that if I went back over the math I could prove to myself why this is, but it's been too long since I studied it, and I don't have the time right now. Can someone tell me why this happens, or whether I've made a mistake here?
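The original test code isn't shown in the question, so here is a minimal sketch of what such a two-tailed test might look like (the function name and the use of the normal approximation to the binomial are my assumptions, not the poster's code). It illustrates the behavior above: with the null fixed at p0 = 0.5, the p-value is approximately uniform on (0, 1) when the true p is 0.5, and underflows toward zero for any other p at n = 1,000,000 trials.

```python
import math
import random

def two_sided_p(successes, n, p0=0.5):
    """Two-sided p-value for observing `successes` out of `n` trials
    under the null hypothesis that the success probability is p0,
    using the normal approximation to the binomial distribution."""
    mean = n * p0
    sd = math.sqrt(n * p0 * (1 - p0))
    z = abs(successes - mean) / sd
    return math.erfc(z / math.sqrt(2))  # P(|Z| >= z) for Z ~ N(0, 1)

n = 1_000_000
for true_p in (0.01, 0.5, 0.99):
    k = sum(random.random() < true_p for _ in range(n))
    # The null is always p0 = 0.5: only the true_p = 0.5 run produces a
    # "random-looking" p-value; the other runs give values near zero.
    print(true_p, k, two_sided_p(k, n))
```

When the null hypothesis is true, the p-value of a well-calibrated test is uniformly distributed, so re-running the test gives values "all over the place"; when the null is false by this margin at n = 1,000,000, the p-value collapses toward zero, which is why the other rows look "too consistent."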
Conclusion
I will consider this fully answered: it appears that p = .5 is the null hypothesis for the test I was performing. So, semantically, my problem now is: how do I test a null of p != .5 (i.e., two-tailed)? For those who want an analogy, say I want to test whether a coin is fair. I have found a specific answer on this site addressing this new problem of mine. Here's the link: Testing if a coin is fair
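In that spirit, the fix amounts to putting the probability being tested into the null hypothesis instead of hard-coding p0 = .5. A hedged sketch, again using the normal approximation (an exact binomial test, e.g. scipy's, would also work; the helper name is mine):

```python
import math
import random

def two_sided_p(successes, n, p0):
    """Two-sided p-value under H0: success probability = p0
    (normal approximation to the binomial)."""
    sd = math.sqrt(n * p0 * (1 - p0))
    z = abs(successes - n * p0) / sd
    return math.erfc(z / math.sqrt(2))

n = 100_000  # smaller than the question's 1,000,000, for speed
alpha = 0.01
for p0 in (0.01, 0.1, 0.33, 0.5, 0.66, 0.9, 0.99):
    k = sum(random.random() < p0 for _ in range(n))
    pval = two_sided_p(k, n, p0)  # null now matches the tested probability
    print(p0, k, pval, "reject" if pval < alpha else "fail to reject")
```

A correctly working generator should now print "fail to reject" for every row, with each row still falsely rejecting with probability roughly alpha = .01. As a sanity check, plugging the question's observed p = 0.5 count (500369 out of 1,000,000) into this helper gives approximately 0.46, close to the posted p-value of 0.4611.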