To correct for optimism in my model, I have used Efron's bootstrap (repeated 500 times) to estimate the optimism and correct the apparent values of several validation metrics.
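
In code, what I am doing looks roughly like this, with a made-up logistic model and data frame `d` standing in for my actual setup:

```r
set.seed(1)
d <- data.frame(y = rbinom(200, 1, 0.4), x1 = rnorm(200), x2 = rnorm(200))

auc <- function(y, p) {                        # Mann-Whitney form of the AUC
  n1 <- sum(y == 1); n0 <- sum(y == 0)
  (sum(rank(p)[y == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)
}

fit <- glm(y ~ x1 + x2, binomial, data = d)
apparent <- auc(d$y, fitted(fit))              # apparent AUC on the original data

B <- 500
res <- replicate(B, {
  b  <- d[sample(nrow(d), replace = TRUE), ]   # bootstrap sample
  fb <- glm(y ~ x1 + x2, binomial, data = b)   # model developed on it
  c(boot = auc(b$y, fitted(fb)),               # evaluated on the same bootstrap sample
    orig = auc(d$y, predict(fb, d, type = "response")))  # evaluated on the original data
})
optimism  <- mean(res["boot", ] - res["orig", ])
corrected <- apparent - optimism               # optimism-corrected AUC
```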

Now, for example, in one case my apparent AUC value is 0.70 and the optimism-corrected value is 0.65. However, the confidence interval (also based on the Efron bootstrap) is [0.66, 0.81], so it does not contain the corrected value. And this holds for several metrics. Intuitively (at least to me) that seems weird, or maybe it means that my model is really bad (which may be the case)? Or should the interval be compared to the apparent value instead?

Anyway, is this normal behavior, or am I doing something wrong?

For clarity, the way I calculate the CI is by taking the distribution of the bootstrapped metric values (i.e., the model developed on a bootstrap sample and evaluated on that same bootstrap sample, not on the original sample), sorting them from low to high, and taking the (0.025 × 500)th value for the lower limit and the mirror image for the upper limit, as in the snippet below.
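
In code (continuing the sketch above, where `res["boot", ]` holds those 500 values):

```r
boot_vals <- res["boot", ]                     # metric from models developed and
                                               # evaluated on the same bootstrap sample
s  <- sort(boot_vals)
ci <- c(s[ceiling(0.025 * B)],                 # the (0.025 * 500)th value from below
        s[ceiling(0.975 * B)])                 # and its mirror image from above
## equivalently: quantile(boot_vals, c(0.025, 0.975))
```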

– Denver Dang

1 Answer

This behavior can occur when the sample-based estimate of a parameter is biased away from the population value, and you documented such bias in your bootstrapping. Insofar as bootstrap resampling from your original data represents sampling of the original data from the population, you showed that the AUC estimate from your sample is biased upward by about 0.05 units from the value in the population from which you sampled. This page shows a similar result for confidence intervals of the necessarily biased plug-in estimate of Shannon entropy, and this page goes into what I remember as excruciating detail about different ways to estimate confidence intervals and how the way that you calculated yours can lead to this behavior when there is bias.

Efron and others recognized this problem early on, along with the additional problems that arise from skew in the distribution of the estimates. The BCa bootstrap, which attempts to correct for both bias and skew, is implemented in standard statistical software.
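
For example, a minimal sketch with the R boot package; the data frame `d` and the AUC statistic are made-up stand-ins, and any statistic written in this two-argument form works the same way:

```r
library(boot)

set.seed(1)
d <- data.frame(y = rbinom(200, 1, 0.4), x1 = rnorm(200), x2 = rnorm(200))

auc_stat <- function(data, idx) {              # boot() supplies the resample indices
  b  <- data[idx, ]
  f  <- glm(y ~ x1 + x2, binomial, data = b)
  n1 <- sum(b$y == 1); n0 <- sum(b$y == 0)
  (sum(rank(fitted(f))[b$y == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)
}

bt <- boot(d, auc_stat, R = 500)
boot.ci(bt, type = "bca")                      # bias- and skew-adjusted interval
```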

– EdM
  • Ah, that makes sense. Does this exist in a software package of some sort, where it can be calculated directly from the results? My bootstrapped results took roughly 14 hours to calculate, so if I could avoid re-running the bootstrap, that would be optimal. I found this (https://stackoverflow.com/questions/55401615/r-calculate-bca-from-vector-of-bootstrapped-results), but I am not quite sure whether that is legit or not. – Denver Dang Feb 01 '20 at 17:52
  • @DenverDang I haven't used that solution but see no reason why it shouldn't work. Try small-scale tests to ensure that what you get agrees with `boot.ci()` in the R boot package. I have rolled my own `boot` objects to do this, in a context where I had to remove some values to deal with errors. See the manual for `boot()` and the code of `boot:::boot.return()` to see how to generate them, or adapt the code in the underlying `boot:::bca.ci()`, `boot:::empinf()`, and `boot:::empinf.reg()` functions (a sketch of that computation follows this thread). – EdM Feb 01 '20 at 19:08
  • So now I have actually tested it, and it seems to work quite okay. However, although some of my CI estimates changed a bit, some of the corrected estimates are still outside the BCa CI. Is that just the name of the game, or is there an argument one can use to "explain" this? – Denver Dang Feb 01 '20 at 22:42
  • @DenverDang If the BCa correction is working correctly, I don't think the bias-corrected point estimate should fall outside the confidence interval. Check the code and how the functions were called. I just found [this page](https://blogs.sas.com/content/iml/2017/07/12/bootstrap-bca-interval.html) with SAS code that you might be able to adapt. – EdM Feb 04 '20 at 15:55
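
As referenced in the thread above, a hedged sketch of the BCa computation applied to an already-stored vector of bootstrap estimates, so the long resampling run need not be repeated; the names `bca_ci`, `t`, `t0`, and `jk` are made up for illustration, and note that the acceleration term still requires jackknife (leave-one-out) re-estimates from the original data:

```r
bca_ci <- function(t, t0, jk, alpha = 0.05) {
  z0 <- qnorm(mean(t < t0))                    # bias correction: offset of the
                                               # bootstrap distribution from t0
  L  <- mean(jk) - jk                          # jackknife influence values
  a  <- sum(L^3) / (6 * sum(L^2)^1.5)          # acceleration (skew) factor
  zq <- qnorm(c(alpha / 2, 1 - alpha / 2))
  p  <- pnorm(z0 + (z0 + zq) / (1 - a * (z0 + zq)))  # BCa-adjusted percentile levels
  quantile(t, p, names = FALSE)
}

## e.g. bca_ci(boot_vals, t0 = apparent, jk = leave_one_out_estimates)
```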