Let's say that I have a binary classifier and perform leave-one-out cross-validation.
I then have one vector of predicted labels $Y_{pred}$ and one vector of true labels $Y_{true}$.
Is it correct to perform bootstrapping on the pairs $(Y_{pred,i}, Y_{true,i})$ to estimate the CI of the accuracy?
In other words, given a dataset with $M$ samples, $(Y_{true},Y_{pred}\in\{0,1\}^M)$:
- for $n=1, \ldots,N$:
- define $I$ by randomly selecting $M$ indices $i\in\{1,\ldots,M\}$ with replacement
- calculate the accuracy with the selected pairs $a_n=Accuracy(Y_{true,i},Y_{pred,i}), \quad i\in I$
This procedure gives me $N$ values for the accuracy $a_n$ from which I can estimate the CI.
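To make the procedure concrete, here is a minimal sketch in Python/NumPy. The labels are simulated here purely for illustration (in practice they would be the LOOCV outputs); `N = 2000` resamples and the 2.5/97.5 percentiles for a 95% CI are assumptions, not prescribed values.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for LOOCV output: M paired true/predicted labels.
M = 200
y_true = rng.integers(0, 2, size=M)
# Simulate a classifier that agrees with the truth ~80% of the time.
y_pred = np.where(rng.random(M) < 0.8, y_true, 1 - y_true)

N = 2000  # number of bootstrap resamples (assumed)
acc = np.empty(N)
for n in range(N):
    # Draw M indices with replacement and recompute accuracy on the resample.
    idx = rng.integers(0, M, size=M)
    acc[n] = np.mean(y_true[idx] == y_pred[idx])

# Percentile CI from the bootstrap distribution of the accuracy.
lo, hi = np.percentile(acc, [2.5, 97.5])
print(f"accuracy = {np.mean(y_true == y_pred):.3f}, "
      f"95% CI = [{lo:.3f}, {hi:.3f}]")
```

Note that each $a_n$ is computed on a resample of the *fixed* prediction vector, so the spread of `acc` reflects only the sampling variability of the test pairs, not the variability of the fitted classifier itself.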
Is this a correct procedure to estimate the variability of the accuracy? If not, what does this procedure actually estimate?