As I understand it, leave-one-out cross-validation (LOOCV) holds one sample out as the test set, fits a model to the remaining data, computes the prediction error on the held-out sample, and repeats this for each of the n samples. Since I am using partial least squares (PLS) regression, LOOCV is a good approach for determining the number of components, but what about validation? Which is the best model out of the n models produced by the LOOCV procedure? I read here that cross-validation is a "way of estimating the generalization performance of models generated by a particular procedure". That's fine, but what I am interested in is to finally get a model that I can apply to some other dataset without doing the whole procedure (i.e. PLS) again.
1 Answer
"What I am interested in is to finally get a model that I can apply to some other dataset without doing the whole procedure (i.e. PLS) again."
I think this is somewhat misguided; model selection (e.g. choosing the number of components) should be viewed as an integral part of the model fitting procedure, so you should repeat it independently every time you fit a model to a new dataset.
Note that if cross-validation is used to choose the model, the cross-validation score of the chosen model is an optimistically biased performance estimate, so it is better to use nested cross-validation: the outer cross-validation is used for performance estimation and the inner cross-validation is used for model selection, independently in each fold of the outer cross-validation.

Dikran Marsupial