I am trying to think of ways to combine the bootstrap and cross-validation (CV) to estimate out-of-sample prediction error and a confidence interval for it. I was initially thinking of applying this to partial least squares (PLS) analyses, but the question is more general.
I've read a few papers that seem to do the bootstrap first (i.e. randomly resample the data with replacement), then run the model with CV on each bootstrap sample (repeating B times) to generate a distribution of r^2 values (or whatever prediction-error metric you like). This doesn't make sense to me, because it seems to defeat the point of CV: in each bootstrap sample roughly 37% of the rows are duplicate copies (on average only about 63% of the original observations appear at least once), so the same observation can end up in both the CV training and test folds, and the test fold is no longer truly out-of-sample.
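To make sure I'm describing that scheme correctly, here is a minimal sketch of how I understand it, written in Python with scikit-learn. The choice of `PLSRegression`, `n_components=2`, B, and the fold count are just placeholders, not anything the papers specify:

```python
import numpy as np
from sklearn.cross_decomposition import PLSRegression
from sklearn.model_selection import cross_val_predict, KFold
from sklearn.metrics import r2_score

rng = np.random.default_rng(0)

def bootstrap_then_cv(X, y, B=200, n_splits=5):
    """The scheme as I understand it: resample the data first,
    then run CV within each bootstrap sample and collect r^2."""
    n = len(y)
    r2_boot = []
    for _ in range(B):
        idx = rng.integers(0, n, size=n)        # sample rows with replacement
        Xb, yb = X[idx], y[idx]                 # ~37% of these rows are duplicates
        model = PLSRegression(n_components=2)   # placeholder model/settings
        cv = KFold(n_splits=n_splits, shuffle=True)
        pred = cross_val_predict(model, Xb, yb, cv=cv)
        # duplicated observations can sit in both the training and test folds here
        r2_boot.append(r2_score(yb, pred))
    return np.array(r2_boot)
```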
Alternatively, I was thinking it might be possible to start by calculating the CV predicted values, then bootstrap-resample the (observed, predicted) pairs, calculate the prediction-error statistic (e.g. r^2) on each resample, and repeat to get a distribution of r^2. I'm not sure this is justified, however: normally in bootstrapping you resample the data first and refit the model to calculate your output, whereas here the model has already been run and only the predicted values are bootstrap-resampled to generate the distribution.
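For concreteness, this is a sketch of that alternative (same Python/scikit-learn setup and the same placeholder model as above): CV is run once, and only the paired observed and CV-predicted values are resampled.

```python
import numpy as np
from sklearn.cross_decomposition import PLSRegression
from sklearn.model_selection import cross_val_predict, KFold
from sklearn.metrics import r2_score

rng = np.random.default_rng(0)

def cv_then_bootstrap(X, y, B=2000, n_splits=5):
    """My alternative: get CV predictions once, then bootstrap-resample
    the (observed, CV-predicted) pairs to build a distribution of r^2."""
    model = PLSRegression(n_components=2)        # placeholder model/settings
    cv = KFold(n_splits=n_splits, shuffle=True)
    pred = cross_val_predict(model, X, y, cv=cv).ravel()
    n = len(y)
    r2_boot = np.empty(B)
    for b in range(B):
        idx = rng.integers(0, n, size=n)         # resample pairs, not the raw data
        r2_boot[b] = r2_score(y[idx], pred[idx]) # the model is never refit
    ci = np.percentile(r2_boot, [2.5, 97.5])     # e.g. a simple percentile interval
    return r2_boot, ci
```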
Is the latter method justified? I've been trying to find papers on this which are written at a relatively accessible level...