Why does this reviewer question our use of bootstrap confidence intervals?

Question

I am working with an experimental dataset mainly involving reaction times. The data are 0-bounded and positively skewed, so I have been using nonparametric tests and correlations in the analysis. The dataset contains 40 participants, each of whom has contributed 1000+ reaction times. In the descriptive statistics, I report bootstrapped confidence intervals (10,000 iterations), and I also use bootci (MATLAB) to produce CIs for Spearman's rho. Linear modelling is performed using glmer() (R), fit with a gamma distribution and a log link function.

I got back some revisions and one of the reviewers said:

Why did the authors decide to go with bootstrapping? Please indicate also the classical CIs for comparison and transparency.

I am happy to explain my choice, but the request to include the classical CIs, especially as it is for "transparency", makes me wonder whether I have missed something in my understanding. Can anyone shed some light on the dangers of providing only bootstrapped CIs?

It's going to be a lot of work to go and add them all (not to mention very cluttered in the text) so I'd like to at least soothe myself in the knowledge it's statistically worthwhile!
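For a single statistic such as a mean, the side-by-side comparison the reviewer asks for can be sketched as below. This is only an illustration under stated assumptions: the 40 values are simulated gamma draws standing in for one summary value per participant (not the real data), the function names are hypothetical, and the t critical value for 39 degrees of freedom is hard-coded so the sketch needs only the Python standard library.

```python
import random
import statistics

def classical_mean_ci(values, t_crit):
    """Classical t interval: mean +/- t_crit * s / sqrt(n)."""
    n = len(values)
    mean = statistics.fmean(values)
    sem = statistics.stdev(values) / n ** 0.5
    return mean - t_crit * sem, mean + t_crit * sem

def percentile_bootstrap_ci(values, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the mean (resampling with replacement)."""
    rng = random.Random(seed)
    n = len(values)
    boot_means = sorted(
        statistics.fmean(rng.choices(values, k=n)) for _ in range(n_boot)
    )
    lower = boot_means[round((alpha / 2) * n_boot)]
    upper = boot_means[round((1 - alpha / 2) * n_boot) - 1]
    return lower, upper

# ASSUMPTION: simulated, positively skewed stand-in data (seconds),
# one value per participant; replace with the real per-participant summaries.
data_rng = random.Random(1)
participant_means = [data_rng.gammavariate(4.0, 0.15) for _ in range(40)]

T_CRIT_DF39 = 2.0227  # approx. 97.5% quantile of Student's t with 39 df
classical = classical_mean_ci(participant_means, T_CRIT_DF39)
bootstrap = percentile_bootstrap_ci(participant_means)

print("classical t CI:  (%.3f, %.3f)" % classical)
print("bootstrap CI:    (%.3f, %.3f)" % bootstrap)
```

The classical interval is symmetric around the mean by construction, whereas the percentile interval can be asymmetric, which is one reason the two may differ for skewed data.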

A measure being classical means that it can easily be compared with previous results and is recognized by researchers who are used to it. But I wouldn't even know what the classical way to compute CIs for Spearman correlation is. — carlo, Oct 23 '21 at 13:58
As a side note, that strikes me as an excessively pedantic review, but whatever; you have to deal with what you've been dealt. — carlo, Oct 23 '21 at 14:03
You perhaps could include the classical methods in an appendix, since they’re not really part of your analysis. You’re allowed, however, to tell the reviewer that you’re not doing it. If you have good justification for doing it the way you did, that could be enough for the editor to publish your article. (Therefore, review your justification of the methods you used and why you chose them over other methods. From your description on here, you seem to be able to do such justification.) — Dave, Oct 23 '21 at 14:04
(Continued) Classical methods aren’t necessary in my mind. By analogy, I don’t need to walk or ride a horse to work to appreciate my car. — Dave, Oct 23 '21 at 14:06
With a sample of just 40, one might legitimately be concerned about the applicability of a bootstrap, whose justification is solely asymptotic, relying thereby on having large samples. Thus, comparison to intervals based on better theoretical grounds could be a worthwhile addition to your analysis. — whuber, Oct 23 '21 at 15:15
Thanks @Whuber, the sample total is 60,000+ RTs, which is what I am typically bootstrapping (e.g., giving the overall mean RT across the study). In that case, the fit CI is almost identical to the bootstrap CI. In the situation where I am measuring at the participant level (n = 40), such as for the Spearman's rho analyses, I have yet to find a way to provide classical CIs in MATLAB, but any advice is appreciated. — stck8888, Oct 23 '21 at 15:42
You report data only for 40 participants: that's the sample size if you want to draw inferences about any other people than those. If indeed you are treating it in the bootstrap as a sample of 60,000, that raises questions concerning the correctness of your calculations. — whuber, Oct 23 '21 at 15:46
The following link has some discussion about a "classic" way to compute confidence intervals for Spearman's *rho*, as well as some discussion about why this method may not be great. There are also some references that apparently suggest that bootstrapped CI's may be better for *rho*. [stats.stackexchange.com/questions/18887/how-to-calculate-a-confidence-interval-for-spearmans-rank-correlation](https://stats.stackexchange.com/questions/18887/how-to-calculate-a-confidence-interval-for-spearmans-rank-correlation) — Sal Mangiafico, Oct 23 '21 at 15:48
@whuber I see what you mean - in other words, aggregate at level of participant and find appropriate CI? — stck8888, Oct 23 '21 at 15:51
Right. This sounds like a repeated-measures problem. We have many posts about such analyses, so I'm sure a little searching of this site is likely to turn up some useful suggestions for computing CIs. — whuber, Oct 23 '21 at 15:52
You can compute a classical CI as point estimator $\pm 2\sigma$. If this interval makes no sense because it extends into the negative range, this might be a good point to convince the reviewer. A problem might be, though, how to compute $\sigma$. Maybe the reviewer would accept a jackknife variance estimate. — cdalitz, Oct 23 '21 at 15:52
@whuber And for the record, this within-participant aggregation is (I am realising) not typical practice in my field, and I don't think it is what the reviewer wants, although I think I understand the issue. Thank you! — stck8888, Oct 23 '21 at 15:53
@cdalitz A lot of important details are lurking in the proper interpretation of "$\sigma$"! — whuber, Oct 23 '21 at 15:53
@whuber Sorry, pressed "enter" too early (mistook it for a line break, which does not work in comments). I had added a suggestion for how to estimate $\sigma$, under the assumption that the reviewer would not accept a bootstrap estimate of $\sigma$ either. — cdalitz, Oct 23 '21 at 15:56
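
To make the participant-level point in the comments concrete, here is a minimal sketch that resamples whole participants (n = 40) to get a percentile bootstrap CI for Spearman's rho, next to one common "classical" approximation based on the Fisher z transform with standard error $1/\sqrt{n-3}$. Everything here is an assumption for illustration: the data are simulated, the function names are hypothetical, and other standard-error formulas for rho exist, so the Fisher interval should be read as a rough comparison rather than the definitive classical method.

```python
import math
import random

def ranks(values):
    """Average ranks (1-based); tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def spearman(x, y):
    """Spearman's rho: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

def fisher_z_ci(rho, n, z_crit=1.96):
    """Approximate CI via Fisher's z, assuming SE = 1 / sqrt(n - 3)."""
    z = math.atanh(rho)
    half_width = z_crit / math.sqrt(n - 3)
    return math.tanh(z - half_width), math.tanh(z + half_width)

def participant_bootstrap_ci(x, y, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI, resampling participants (pairs) with replacement."""
    rng = random.Random(seed)
    n = len(x)
    stats = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        stats.append(spearman([x[i] for i in idx], [y[i] for i in idx]))
    stats.sort()
    return stats[round(alpha / 2 * n_boot)], stats[round((1 - alpha / 2) * n_boot) - 1]

# ASSUMPTION: simulated stand-in data, one (x, y) pair per participant.
data_rng = random.Random(2)
mean_rt = [data_rng.gammavariate(4.0, 0.15) for _ in range(40)]
covariate = [v + data_rng.gauss(0.0, 0.3) for v in mean_rt]

rho = spearman(mean_rt, covariate)
print("rho = %.3f" % rho)
print("Fisher z CI:   (%.3f, %.3f)" % fisher_z_ci(rho, len(mean_rt)))
print("bootstrap CI:  (%.3f, %.3f)" % participant_bootstrap_ci(mean_rt, covariate))
```

The key design choice is that the resampling unit is the participant, not the individual reaction time, so the effective sample size in both intervals is 40.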
