Suppose that we perform forward stepwise regression and use cross-validation to choose the best model size.
Using the full data set to choose the sequence of models is the WRONG way to do cross-validation (we need to redo the model selection step within each training fold). If we do cross-validation the WRONG way, which of the following is true?
- The selected model will probably be too complex.
- The selected model will probably be too simple.
The answer given was the first option (too complex), but I don't quite understand why.
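To make the two procedures concrete, here is a minimal simulation sketch comparing them. Everything in it is an assumption for illustration and not part of the original question: the data are pure noise (so the honest best model size is 0), the sizes `n`, `p`, `max_size` and the helper names `forward_stepwise`, `holdout_mse` and `cv_curve` are hypothetical, and forward stepwise is implemented naively with NumPy least squares. The point it tries to show is that when the sequence is chosen on the full data, the chosen predictors have already "seen" the held-out responses, so the cross-validation error looks too good for larger models.

```python
import numpy as np

def forward_stepwise(X, y, max_size):
    """Greedy forward selection: return predictor indices in the order added."""
    n, p = X.shape
    selected = []
    for _ in range(max_size):
        best_j, best_rss = None, np.inf
        for j in range(p):
            if j in selected:
                continue
            cols = np.array(selected + [j], dtype=int)
            A = np.column_stack([np.ones(n), X[:, cols]])
            beta, *_ = np.linalg.lstsq(A, y, rcond=None)
            rss = np.sum((y - A @ beta) ** 2)
            if rss < best_rss:
                best_j, best_rss = j, rss
        selected.append(best_j)
    return selected

def holdout_mse(X_tr, y_tr, X_te, y_te, cols):
    """Fit intercept + chosen columns on the training fold, score on the test fold."""
    cols = np.array(cols, dtype=int)
    A_tr = np.column_stack([np.ones(len(y_tr)), X_tr[:, cols]])
    A_te = np.column_stack([np.ones(len(y_te)), X_te[:, cols]])
    beta, *_ = np.linalg.lstsq(A_tr, y_tr, rcond=None)
    return np.mean((y_te - A_te @ beta) ** 2)

def cv_curve(X, y, folds, max_size, reselect):
    """CV error for model sizes 0..max_size.

    reselect=False: the WRONG way, one sequence chosen on the full data.
    reselect=True:  the right way, selection redone inside each training fold.
    """
    n = len(y)
    full_order = forward_stepwise(X, y, max_size)
    errs = np.zeros(max_size + 1)
    for test_idx in folds:
        train_idx = np.setdiff1d(np.arange(n), test_idx)
        if reselect:
            order = forward_stepwise(X[train_idx], y[train_idx], max_size)
        else:
            order = full_order
        for k in range(max_size + 1):
            errs[k] += holdout_mse(X[train_idx], y[train_idx],
                                   X[test_idx], y[test_idx], order[:k])
    return errs / len(folds)

# Assumed toy setup: y is independent of every predictor.
rng = np.random.default_rng(0)
n, p, max_size, n_folds = 60, 100, 8, 5
X = rng.standard_normal((n, p))
y = rng.standard_normal(n)
folds = np.array_split(rng.permutation(n), n_folds)

wrong_err = cv_curve(X, y, folds, max_size, reselect=False)
right_err = cv_curve(X, y, folds, max_size, reselect=True)

print("size chosen the WRONG way:", int(np.argmin(wrong_err)))
print("size chosen the right way:", int(np.argmin(right_err)))
```

In a setup like this, the WRONG-way error curve tends to keep falling as predictors are added, because each added predictor was picked for its correlation with all of the responses, held-out ones included. The right-way curve does not get that advantage, so it tends to favour a smaller model.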