Is it possible to perform the following test of hypothesis?

Question

Suppose $X_1,X_2,\ldots,X_n$ are iid $\text{Bernoulli}(p)$. We want to test the hypothesis $H_0: p=0.5$ vs. $H_1: p=0.75$, and also the hypothesis $H_0:p=0.75$ vs. $H_1:p=0.5$ at level $\alpha=0.05$. Is it possible to construct such a test that rejects both $H_0$ at the stated level? If yes, give an example. Present some more data so that the test rejects both $H_0$ at level $\alpha=0.05$.

As I understand it, rejecting both nulls at level $\alpha=0.05$ is equivalent to saying that if $\phi$ is my test, then $E_{p=0.5}\phi\leq 0.05$ and $E_{p=0.75}\phi\leq 0.05$.

So the test $\phi\equiv0.05$ trivially works. In fact any test function $\phi$ which is bounded above by $0.05$ works. But this is a trivial example and when I presented it, it was rejected since the examiners were looking for a test which is of the form "Reject $H_0$ if ..." i.e. a test $\phi\in\{0,1\}$.

I have no idea how to proceed in this case. Some help/hints are appreciated.

Is this a question from a course or textbook? If so, please add the `[self-study]` tag & read its [wiki](http://stats.stackexchange.com/tags/self-study/info). — gung - Reinstate Monica, Jan 07 '17 at 16:20
It is from a course in Statistics. I have added the tag. Thank you for the suggestion. — Landon Carter, Jan 07 '17 at 16:21
What class is it, & who is offering it? We don't usually set up null & alternative hypotheses that way (cf, [here](http://stats.stackexchange.com/a/159117/7290)). — gung - Reinstate Monica, Jan 07 '17 at 16:24
This is a valid representation IMHO. Many textbooks follow it. The null and alternative need not be exhaustive, but they have to be disjoint. — Landon Carter, Jan 07 '17 at 16:26
It is not consistent to do both. In the first case you could be looking for a test as to whether or not you are flipping a fair coin, so p=0.5 is the null hypothesis. The alternative choice of p=0.75 is to specify a value you want to have power against. The second case might be where you have a weighted coin that seems to give heads 75% of the time. You fix the coin and hope that you have made it fair. Now you collect data on the modified coin to test the null hypothesis that the change had no effect versus the alternative that it is now fair. Clearly you want high power at p=0.5. — Michael R. Chernick, Jan 07 '17 at 16:32
The idea of switching the traditional null and alternative hypotheses comes up in clinical trials where you are comparing two drugs and need to show the regulatory agency (FDA in the US) that the two drugs are equivalent which you do when looking for a generic replacement. Similarly there is a criterion called non-inferiority that also leads to a switch of null and alternative hypotheses. — Michael R. Chernick, Jan 07 '17 at 16:38
I understand if I want high power then it is not consistent to do both. But what if I do not want to look at power at all, just try to find a test? I would ask you to consider this question not from a practitioner's perspective, but from a more academic or theoretical point of view. But anyway, even if I accept your argument, can you please provide an idea of how much extra data I can supply to test the hypotheses? — Landon Carter, Jan 07 '17 at 16:39
Actually I am just an undergraduate student. I want to be exposed to these different ideas behind estimation and hypothesis testing. Can you please suggest some references for this? I am also looking for references on formulating questions like these in the language of statistics. Can you please help me? — Landon Carter, Jan 07 '17 at 16:40
@Michael I believe you might have misinterpreted the question. The null hypothesis consists of two points. The alternative is their complement, consisting of the three intervals $[0,1/2)\cup(1/2,3/4)\cup(3/4,1]$. It's perfectly valid and not inconsistent to postulate a problem like this. — whuber, Nov 03 '17 at 23:07

score 1 · Answer 1 · answered Nov 03 '17 at 23:01

This is a nice example of an unfamiliar-looking test. Studying it therefore helps us understand the fundamental concepts better.

Because the sum $t_n=X_1+X_2+\cdots+X_n$ is a sufficient statistic, we may focus the analysis on it. Let the two hypotheses be written $H_{0.5}$ and $H_{0.75}$. Under hypothesis $H_p$, $t_n$ has a Binomial$(n,p)$ distribution with possible values in the set $[n]=\{0,1,2,\ldots,n\}$.

Inspired by the Neyman-Pearson theory, we would naturally seek a critical region to be a subset of possible values of $t_n$ that have small probabilities under both hypotheses. Such values are far from both $0.5n$ and from $0.75n$, which are close to the modes under the hypotheses. Evidently such a critical region will then comprise, at most, three intervals: one from $0$ to, say, $c_1 \lt 0.5 n$; another from $c_2\gt0.5 n$ to $c_3\lt 0.75 n$; and a third from $c_4\gt 0.75n $ to $n$. The critical region will be of the form

$$C_n(\alpha) = [0, c_1] \cup [c_2,c_3] \cup [c_4, n].$$

For either hypothesis $p\in\{0.5,0.75\}$ we need

$${\Pr}_p(t_n \in C_n) \le 0.05$$

and we would like this probability to be as close to $0.05$ as possible for both hypotheses.

For large $n$ we could estimate these cutpoints $c_i$ using Normal approximations and then check. For smaller $n$ we may do a brute force search over all possibilities. I like this approach because it is not approximate: it gives the best possible solution, which can be instructive. (It might, however, take considerable time, especially for small test sizes $\alpha$. The brute force search described below takes about a minute for $\alpha=0.01$, for instance; $n=68$ is the smallest sample size that admits such a simultaneous test of level $0.01$.)

If more than one critical region is found for a given $n$, we might then select the one that covers the most ground, that is, the one with the longest total length. That will help make it most powerful among many alternatives. Among such critical regions, if there is more than one, let's take the one for which the false positive rates are as close as possible to each other so as not to favor one hypothesis over the other.
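The search just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code actually used for this answer: the names `best_region` and `smallest_n` are hypothetical, all three intervals are assumed to be non-empty with $c_1 \lt 0.5n \lt c_2 \le c_3 \lt 0.75n \lt c_4$, and probabilities are computed in ordinary floating point.

```python
from math import comb

ALPHA = 0.05
P_VALUES = (0.5, 0.75)  # the two point hypotheses


def pmf(n, p):
    """Binomial(n, p) probabilities for k = 0, 1, ..., n."""
    return [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]


def best_region(n, alpha=ALPHA):
    """Return (c1, c2, c3, c4) for the preferred region at sample size n, or None.

    Assumption: all three intervals [0, c1], [c2, c3], [c4, n] are non-empty,
    with c1 < n/2 < c2 <= c3 < 3n/4 < c4 (checked with integer arithmetic).
    """
    # cums[i][k] = Pr(t_n < k) under P_VALUES[i]
    cums = []
    for p in P_VALUES:
        cum = [0.0]
        for q in pmf(n, p):
            cum.append(cum[-1] + q)
        cums.append(cum)

    def size(cum, c1, c2, c3, c4):
        # Probability of [0, c1] U [c2, c3] U [c4, n]
        return cum[c1 + 1] + (cum[c3 + 1] - cum[c2]) + (cum[n + 1] - cum[c4])

    best, best_key = None, None
    for c1 in range(n + 1):
        if 2 * c1 >= n:          # need c1 < n/2
            break
        for c2 in range(c1 + 1, n + 1):
            if 2 * c2 <= n:      # need c2 > n/2
                continue
            for c3 in range(c2, n + 1):
                if 4 * c3 >= 3 * n:   # need c3 < 3n/4
                    break
                for c4 in range(c3 + 1, n + 1):
                    if 4 * c4 <= 3 * n:   # need c4 > 3n/4
                        continue
                    sizes = [size(cum, c1, c2, c3, c4) for cum in cums]
                    if max(sizes) > alpha:
                        continue
                    # Prefer more points in the region, then closer sizes.
                    length = (c1 + 1) + (c3 - c2 + 1) + (n - c4 + 1)
                    key = (length, -abs(sizes[0] - sizes[1]))
                    if best_key is None or key > best_key:
                        best, best_key = (c1, c2, c3, c4), key
    return best


def smallest_n(max_n=60):
    """Smallest n <= max_n admitting a region, with that region, or None."""
    for n in range(1, max_n + 1):
        region = best_region(n)
        if region is not None:
            return n, region
    return None


if __name__ == "__main__":
    print(smallest_n())
```

Calling `smallest_n()` walks upward through the sample sizes and stops at the first one for which some admissible region exists, so its output can be compared directly with the values reported in this answer.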

The smallest $n$ with any solution is $n=35$. The best solution in the preceding sense is

$$C_{35}(0.05) = [0,9] \cup [22,22] \cup [33,35].$$

Here is a plot of the probability functions of both hypotheses, distinguished by color. The critical region $C_{35}(0.05)$ is located within the darker background areas. For each hypothesis, the chance of being in one of the darker areas is less than, but almost equal to, $\alpha=0.05$.

In other words, reject both hypotheses if $t_{35}$ is less than or equal to $9$, equal to $22$, or greater than or equal to $33$. The chances of this event are $0.0460$ under $H_{0.5}$ and $0.0426$ under $H_{0.75}$.
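These two rejection probabilities are easy to check directly from the Binomial probability function. The following is a minimal sketch using only the Python standard library; the helper name `region_probability` is hypothetical and introduced only for illustration.

```python
from math import comb


def region_probability(n, p, region):
    """Pr(t_n in region) for t_n ~ Binomial(n, p).

    `region` is a list of inclusive integer intervals (lo, hi).
    """
    return sum(
        comb(n, k) * p**k * (1 - p) ** (n - k)
        for lo, hi in region
        for k in range(lo, hi + 1)
    )


# The critical region [0, 9] U [22, 22] U [33, 35] for n = 35
C35 = [(0, 9), (22, 22), (33, 35)]

for p in (0.5, 0.75):
    print(p, region_probability(35, p, C35))
```

Both printed values should fall just below $0.05$, and the single point $t_{35}=22$ accounts for most of the probability in each case.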

Notice how both hypotheses are (of course) rejected when $t_{35}$ is near $0$ or near $n$, which is rare under either hypothesis; that both are rejected when $t_{35}=22$, which is not terribly rare under either hypothesis; and that many outcomes for which neither is rejected are outcomes that are common for one of the hypotheses but rare for the other.
