Does it affect the guarantees of a test to take more samples if the test fails to reject, then perform it again?

Question

For example, suppose I am concerned with two Bernoulli RVs and want to test the alternative $p_1 < p_2$ against the null $p_1 \geq p_2$. I take some samples from each distribution, and discover that the empirical proportions satisfy $\hat{p}_1 < \hat{p}_2$, but the test doesn't reject the null because I didn't take enough samples to establish significance. I then gather more samples, perform the same test at the same significance level (using both my old samples and my new ones), and this time it does reject the null.

Is this "algorithm" kosher? Or can the fact that the new samples taken are conditioned on the old samples satisfying $\hat{p}_1 < \hat{p}_2$ and the failure of the initial test to reject somehow affect the correctness of the procedure? If it does affect the correctness of the procedure, is there still a problem if I do not reuse the old samples and only use the new ones for the next test?

score 2 · Accepted Answer · answered Sep 20 '17 at 15:41

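Your suspicions are correct: the $p$-values you compute this way won't be valid, because whether you run the second test depends on the outcome of the first. Under the null, the procedure gets more than one chance to reject, so its overall type I error rate can exceed the nominal significance level. If you threw out the old data when computing a new $p$-value, that second $p$-value would be valid on its own, but the procedure as a whole would still get two chances to reject, and you'd be sacrificing a lot of power. What you want is a sequential hypothesis test.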
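
To make the inflation concrete, here is a rough simulation sketch. It is not from the original question: it assumes a pooled one-sided two-proportion $z$-test, a null boundary case $p_1 = p_2 = 0.5$, two stages of 50 samples per group, and that more data is gathered whenever the first test fails to reject (the question's extra condition $\hat{p}_1 < \hat{p}_2$ is dropped for simplicity). The names `draw`, `rejects`, and `simulate` are made up for illustration.

```python
import math
import random
from statistics import NormalDist

ALPHA = 0.05
Z_CRIT = NormalDist().inv_cdf(1 - ALPHA)  # one-sided critical value


def draw(n, p, rng):
    """Number of successes in n Bernoulli(p) trials."""
    return sum(rng.random() < p for _ in range(n))


def rejects(x1, n1, x2, n2):
    """Pooled one-sided z-test of H0: p1 >= p2 against H1: p1 < p2."""
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    if se == 0:
        return False  # all successes or all failures: no evidence either way
    z = (x2 / n2 - x1 / n1) / se
    return z > Z_CRIT


def simulate(trials=20000, n=50, p=0.5, seed=0):
    """Estimate rejection rates under the null p1 == p2 == p (assumed setup)."""
    rng = random.Random(seed)
    single = pooled = fresh = 0
    for _ in range(trials):
        a1, a2 = draw(n, p, rng), draw(n, p, rng)
        first = rejects(a1, n, a2, n)
        single += first
        if first:
            # Both retest procedures stop as soon as the first test rejects.
            pooled += 1
            fresh += 1
            continue
        # First test failed to reject: gather a second batch and test again.
        b1, b2 = draw(n, p, rng), draw(n, p, rng)
        pooled += rejects(a1 + b1, 2 * n, a2 + b2, 2 * n)  # reuse old data
        fresh += rejects(b1, n, b2, n)                     # new data only
    return single / trials, pooled / trials, fresh / trials


if __name__ == "__main__":
    single, pooled, fresh = simulate()
    print(f"one test only:        {single:.3f}")
    print(f"retest, pooled data:  {pooled:.3f}")
    print(f"retest, fresh data:   {fresh:.3f}")
```

Under these assumptions, both retest variants should reject a true null noticeably more often than the single test does. For the fresh-data variant the two tests are independent, so with an exact level-$\alpha$ test the overall rate would be $\alpha + (1-\alpha)\alpha$, which is $0.0975$ for $\alpha = 0.05$.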

Kodiologist

I see -- thank you for the reply! After doing some googling, I can find some extensions for likelihood ratio tests in the simple case, but I'm having trouble finding a general technique, something with reasonably good power (not necessarily optimal) but that can be applied in a cookie-cutter fashion to a lot of problems. In particular, I'm looking for a technique that will test an unknown # of hypotheses, reusing old data and terminating at the first reject, but where the exact hypothesis tested may be different each time as it is data-dependent. Does this ring any familiar bells? Thanks! – smacke Sep 21 '17 at 05:07
@smacke I think sequential hypothesis-testing is the closest thing to that. That's the goal of it. – Kodiologist Sep 21 '17 at 05:20