From what I gather, a bootstrap estimate of the generalization error of an ML procedure is optimistically biased; see, e.g.:
- What is the .632+ rule in bootstrapping?
- Why is bootstrapping called an "optimistic" model validator? When should I use bootstrapping or cross validation?
- Cross-validation or bootstrapping to evaluate classification performance?
As far as I understand it, the source of the optimism is that with the bootstrap we end up testing on much of the same data that we trained on.
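
If I understand the mechanics correctly, this overlap can be quantified: the probability that a given observation appears in a bootstrap sample of size $n$ is

$$P(\text{included}) = 1 - \left(1 - \tfrac{1}{n}\right)^{n} \;\longrightarrow\; 1 - e^{-1} \approx 0.632 \quad (n \to \infty),$$

so each bootstrap sample contains on average about 63.2% of the distinct original observations, and any evaluation on the original data overlaps heavily with training; this is also where the 0.632 in the .632+ rule comes from.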
I presume the bias depends on what we are estimating (e.g. the MSE of a particular regressor), so there is no universal formula for it. Is this correct?
If so, are there nonetheless any known universal theoretical lower or upper bounds on the bias of the bootstrap estimator? Or does it depend entirely on the quantity being estimated and, e.g., the underlying ML model?
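
For concreteness, here is a minimal sketch of the kind of "naive" bootstrap estimate I have in mind, compared against the out-of-bag error and the error on a large held-out set (assuming scikit-learn; the toy linear data and Ridge model are purely illustrative):

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)

# Illustrative toy data: linear signal plus Gaussian noise.
n, p = 100, 10
X = rng.normal(size=(n, p))
beta = rng.normal(size=p)
y = X @ beta + rng.normal(scale=1.0, size=n)

# Large held-out set to approximate the true generalization error.
X_test = rng.normal(size=(10_000, p))
y_test = X_test @ beta + rng.normal(scale=1.0, size=10_000)

B = 200
naive_errs, oob_errs = [], []
for _ in range(B):
    idx = rng.integers(0, n, size=n)       # bootstrap sample (with replacement)
    oob = np.setdiff1d(np.arange(n), idx)  # observations not drawn: out-of-bag
    model = Ridge(alpha=1.0).fit(X[idx], y[idx])
    # Naive estimate: evaluate on the full original data, which overlaps
    # heavily (~63.2% of points on average) with the training sample.
    naive_errs.append(mean_squared_error(y, model.predict(X)))
    if len(oob) > 0:
        oob_errs.append(mean_squared_error(y[oob], model.predict(X[oob])))

true_err = mean_squared_error(
    y_test, Ridge(alpha=1.0).fit(X, y).predict(X_test)
)
print(f"naive bootstrap MSE: {np.mean(naive_errs):.3f}")
print(f"out-of-bag MSE:      {np.mean(oob_errs):.3f}")
print(f"held-out 'true' MSE: {true_err:.3f}")
```

If I have this right, the naive estimate should typically come out below the held-out error (optimism) and the out-of-bag estimate above it (pessimism), which is the gap the .632+ rule tries to interpolate across.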