
Inspired by this: http://pss.sagepub.com/content/22/11/1359

This question concerns open-ended data collection, where the necessary sample size cannot be estimated in advance, for the purpose of a frequentist test.

I understand that a stopping condition based on the main outcome is circular. For example, if I stop sampling once my p-value happens to fall below .05, that p-value is biased (so much so as to be mostly worthless). However, suppose I choose a different stopping rule, such as the width of my 95% confidence interval (without regard for other aspects of the test, e.g. whether the CI includes 0): am I introducing any bias (beyond, of course, in the CI width and related statistics)?

As far as I understand it, this is not a problem in a Bayesian analysis, but I am wondering what options exist for conditional stopping that do not preclude frequentist tests.

jona
    I have seen this approach (aim for a given CI width, assessing it as you go) recommended by various authors (sorry no reference at hand). It seems related to the [accuracy in parameter estimation](http://stats.stackexchange.com/questions/16985/how-to-report-general-precision-in-estimating-correlations-within-a-context-of-j#30287) approach. – Gala Jul 27 '13 at 13:58
  • Thank you for the link to AIPE, @GaëlLaurans, and if you happen to remember any of these "various authors", I'd love to look them up! – jona Jul 28 '13 at 10:03
  • This is certainly an interesting question, to which I do not have an answer at the moment. But I did find this blog entry worth a read, even though it concerns the width of Bayesian credible intervals rather than frequentist confidence intervals: http://doingbayesiandataanalysis.blogspot.com/2013/11/optional-stopping-in-data-collection-p.html – Andrew M Dec 04 '14 at 23:25

1 Answer


Jan Vanhove presented simulations showing that optional stopping based on the width of a confidence interval does not introduce bias. He simulated a situation where the null hypothesis was true, running thousands of experiments that kept adding observations until the confidence interval was narrower than a prespecified limit. Since the null hypothesis is known to be true, the p-values ought to be uniformly distributed between 0 and 1, and this is exactly what he saw (figure below). Optional stopping did not bias the p-value.

[Figure: histogram of the simulated p-values under the null hypothesis, approximately uniform between 0 and 1]
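If you want to try this yourself, here is a minimal sketch of that kind of simulation in Python. This is not Vanhove's actual code; the stopping width, starting sample size, and safety cap are illustrative assumptions.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

n_sims = 5000        # number of simulated experiments
n_start = 10         # initial sample size (assumed)
n_max = 500          # safety cap so a run cannot continue forever (assumed)
target_width = 0.5   # stop once the full 95% CI is narrower than this (assumed)

final_p = []
for _ in range(n_sims):
    # The null hypothesis is true: observations come from N(0, 1).
    x = list(rng.normal(0, 1, n_start))
    while len(x) < n_max:
        arr = np.asarray(x)
        se = arr.std(ddof=1) / np.sqrt(len(arr))
        half = stats.t.ppf(0.975, df=len(arr) - 1) * se
        if 2 * half < target_width:
            break  # CI is narrow enough: stop collecting data
        x.append(rng.normal(0, 1))  # otherwise add one more observation
    # Compute the p-value at the sample size where we stopped.
    final_p.append(stats.ttest_1samp(x, 0).pvalue)

# Under H0, stopping on CI width should leave p uniform on (0, 1):
print(np.histogram(final_p, bins=10, range=(0, 1))[0])
```

The printed bin counts should be roughly equal, matching the flat histogram in the figure above: stopping on width determines how precise the estimate is, not the null distribution of the p-value.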

In those simulations, each p-value at each sample size was computed as if the study had been planned to use the sample size reached at that point. Kruschke points out that each calculated p-value then rests on assumptions that do not hold when the data are reanalyzed repeatedly as new data are added. Yet the simulations seem to show that the method works fine; I am not sure how to reconcile this discrepancy.

Harvey Motulsky