nested cross-validation

Question

My outer CV is 5-fold, so after the process I have 5 final models, one from each outer fold. I then apply these 5 models to the whole data set (training + validation + testing). In my case, the 5 accuracies are 0.63, 0.95, 0.92, 0.63, and 0.95. What does this mean? Is the model unstable, or is it overfitting? Admittedly, my sample size is small: 38. What I really want to know is this: when new data come in and I want to apply a single final model to them, which of the 5 should I choose? Thanks a lot.

score 1 · Answer 1 · edited Apr 13 '17 at 12:44

(Nested) cross-validation is a way to estimate the performance of a modeling pipeline. In principle, it doesn't result in a final predictive model.
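To make "estimating the performance of a modeling pipeline" concrete, here is a minimal sketch with scikit-learn. The synthetic data set, the SVC pipeline, and the parameter grid are assumptions chosen for illustration; none of them comes from the question.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for a small data set (38 samples, as in the question).
X, y = make_classification(n_samples=38, n_features=10, n_informative=4,
                           random_state=0)

# Hypothetical pipeline and grid; swap in your own model and hyperparameters.
pipeline = make_pipeline(StandardScaler(), SVC())
param_grid = {"svc__C": [0.1, 1, 10]}

inner_cv = StratifiedKFold(n_splits=3, shuffle=True, random_state=0)
outer_cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=1)

# The inner loop (hyperparameter search) is part of the procedure being evaluated.
search = GridSearchCV(pipeline, param_grid, cv=inner_cv)

# Each outer fold reruns the whole search on its training part and scores
# the tuned model on the held-out part only.
outer_scores = cross_val_score(search, X, y, cv=outer_cv)

print("outer-fold accuracies:", np.round(outer_scores, 2))
print("mean %.2f, std %.2f" % (outer_scores.mean(), outer_scores.std()))
```

The outer-fold scores describe the procedure (search plus fit), not any one of the five fitted models. With 38 samples each outer test fold holds only 7 or 8 cases, so a single misclassification shifts that fold's accuracy by roughly 12 to 14 percentage points.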

Various approaches exist to obtain a final model, the main ones being:

Train one overall model on the full data set, use it for predictions, and report the nested cross-validation estimate as its expected performance.
Make an ensemble of the models you've constructed in the outer cross-validation, typically through bagging.
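Both approaches can be sketched with scikit-learn as follows. Treat this as an illustration under assumptions: the data are synthetic, the SVC pipeline and grid are hypothetical, `X_new` merely stands in for future data, and the ensemble uses a plain majority vote over the outer-fold models as a simple stand-in for bagging.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, StratifiedKFold, cross_validate
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic data: 38 samples to model on, 10 more standing in for new data.
X_all, y_all = make_classification(n_samples=48, n_features=10,
                                   n_informative=4, random_state=0)
X, y = X_all[:38], y_all[:38]
X_new = X_all[38:]

def make_search():
    """Build the full procedure: scaling, SVC, and an inner-CV grid search."""
    pipeline = make_pipeline(StandardScaler(), SVC())
    inner_cv = StratifiedKFold(n_splits=3, shuffle=True, random_state=0)
    return GridSearchCV(pipeline, {"svc__C": [0.1, 1, 10]}, cv=inner_cv)

outer_cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=1)

# Nested CV: keep the model fitted in each outer fold for the ensemble below.
nested = cross_validate(make_search(), X, y, cv=outer_cv, return_estimator=True)
estimated_accuracy = nested["test_score"].mean()

# Approach 1: rerun the same procedure on all 38 samples and predict with it.
final_model = make_search().fit(X, y)
pred_single = final_model.predict(X_new)

# Approach 2: majority vote of the five outer-fold models (labels are 0/1).
fold_models = nested["estimator"]
votes = np.array([m.predict(X_new) for m in fold_models])
pred_ensemble = (votes.mean(axis=0) > 0.5).astype(int)

print("nested CV accuracy estimate: %.2f" % estimated_accuracy)
print("single model:", pred_single)
print("ensemble:    ", pred_ensemble)
```

In neither case is one of the five outer-fold models picked as "the" model. Note also that scoring a fold model on the whole data set mixes cases it was trained on with cases it never saw, so those accuracies are not a performance estimate.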

You probably want to read the answers to this related question too: Training with the full dataset after cross-validation?
