I know there are many questions about cross validation, but I couldn't find one that covers all the aspects I have in mind. In How to evaluate the final model after k-fold cross-validation it is described that cross validation is used to select the best model. I wonder about the following:
- Why are the candidate models not simply all trained on the same training set and tested on the same held-out test set? Why go to the effort of evaluating each model on several different data splits?
- The same question applies when searching for the best hyperparameters of a single model (as in grid search).
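To make the two setups I am comparing concrete, here is a minimal sketch using scikit-learn (the specific models and dataset are just placeholders, not part of my actual problem):

```python
# Two ways to compare candidate models, sketched with scikit-learn.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split, cross_val_score
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, random_state=0)
models = [LogisticRegression(max_iter=1000), DecisionTreeClassifier(random_state=0)]

# Setup 1: one fixed train/test split, the SAME split for every model.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
for model in models:
    score = model.fit(X_tr, y_tr).score(X_te, y_te)
    print(type(model).__name__, "single-split accuracy:", round(score, 3))

# Setup 2: k-fold cross-validation, where each model is trained and
# tested on k different splits and the k scores are averaged.
for model in models:
    scores = cross_val_score(model, X, y, cv=5)
    print(type(model).__name__, "5-fold mean accuracy:", round(scores.mean(), 3))
```

My question is essentially: why is Setup 2 preferred over Setup 1 for choosing between the models (or between hyperparameter settings)?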
Can you please clarify these points?