
Inspired by this: http://pss.sagepub.com/content/22/11/1359

This question concerns open-ended data collection, where the necessary sample size cannot be estimated in advance, for the purpose of a frequentist test.

I understand that a stopping condition based on the main outcome is circular. For example, if I stop sampling once my p-value happens to fall below .05, that p-value is biased (so much so as to be mostly worthless). However, suppose I choose a different stopping rule, such as the width of my 95% confidence interval (without regard for other aspects of the test, e.g. whether the CI includes 0): am I introducing any bias (beyond, of course, in the CI width and related statistics)?

As far as I understand it, this is not a problem in a Bayesian analysis, but I am wondering what options exist for conditional stopping that do not preclude frequentist tests.

jona
    I have seen this approach (aim for a given CI width, assessing it as you go) recommended by various authors (sorry no reference at hand). It seems related to the [accuracy in parameter estimation](http://stats.stackexchange.com/questions/16985/how-to-report-general-precision-in-estimating-correlations-within-a-context-of-j#30287) approach. – Gala Jul 27 '13 at 13:58
  • Thank you for the link to AIPE, @GaëlLaurans, and if you happen to remember any of these "various authors", I'd love to look them up! – jona Jul 28 '13 at 10:03
  • This is certainly an interesting question, to which I do not have an answer at the moment. But I did find this blog entry worth a read, even though it concerns the width of Bayesian credible intervals rather than frequentist confidence intervals: http://doingbayesiandataanalysis.blogspot.com/2013/11/optional-stopping-in-data-collection-p.html – Andrew M Dec 04 '14 at 23:25

1 Answer


Jan Vanhove presented simulations showing that optional stopping based on the width of a confidence interval does not introduce bias. He simulated a situation where the null hypothesis was true, running thousands of experiments that kept adding observations until the confidence interval was narrower than a prespecified limit. Since the null hypothesis is known to be true, the p-values ought to be uniformly distributed between 0 and 1, and this is exactly what he saw (figure below). Optional stopping did not bias the p-value.

[Figure: histogram of the simulated p-values under the null hypothesis, approximately uniform between 0 and 1]
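If you want to try this yourself, here is a minimal sketch of that kind of simulation in Python. This is not Vanhove's actual code; the stopping width, starting sample size, and safety cap are illustrative assumptions.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

n_sims = 5000        # number of simulated experiments
n_start = 10         # initial sample size (assumed)
n_max = 500          # safety cap so a run cannot continue forever (assumed)
target_width = 0.5   # stop once the full 95% CI is narrower than this (assumed)

final_p = []
for _ in range(n_sims):
    # The null hypothesis is true: observations come from N(0, 1).
    x = list(rng.normal(0, 1, n_start))
    while len(x) < n_max:
        arr = np.asarray(x)
        se = arr.std(ddof=1) / np.sqrt(len(arr))
        half = stats.t.ppf(0.975, df=len(arr) - 1) * se
        if 2 * half < target_width:
            break  # CI is narrow enough: stop collecting data
        x.append(rng.normal(0, 1))  # otherwise add one more observation
    # Compute the p-value at the sample size where we stopped.
    final_p.append(stats.ttest_1samp(x, 0).pvalue)

# Under H0, stopping on CI width should leave p uniform on (0, 1):
print(np.histogram(final_p, bins=10, range=(0, 1))[0])
```

The printed bin counts should be roughly equal, matching the flat histogram in the figure above: stopping on width determines how precise the estimate is, not the null distribution of the p-value.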

In those simulations, each p-value at each sample size was computed as if the study had been planned to use the sample size reached at that point. Kruschke points out that each calculated p-value then rests on assumptions that do not hold when the data are reanalyzed repeatedly as new data are added. Yet the simulations seem to show that the method works fine; I am not sure how to reconcile this discrepancy.

Harvey Motulsky