When training a model, a training set, a validation set and a test set are used. Is there any paper or example showing that the use of an independent validation set improves the performance of the lasso estimator? I am particularly interested in situations where the penalty value is chosen through cross-validation.

Donbeo

1 Answer

Independent validation is not supposed to increase performance: it is supposed to measure the performance of the final model (and to detect/monitor the optimistic bias that data-driven optimization, such as cross-validating for the "optimal" penalty, introduces into model selection).
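A minimal sketch of this distinction, under stated assumptions: the data are simulated, the lasso is a small hand-written coordinate-descent solver, and every name, the penalty grid and the data-generating model are illustrative choices rather than anything taken from the question. The cross-validation error is used only to pick the penalty; the independent test set is used only to measure the model that results.

```python
import random


def soft_threshold(z, t):
    """Soft-thresholding operator used in the lasso coordinate update."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def lasso_fit(X, y, lam, n_iter=50):
    """Coordinate descent for (1/2n)*||y - Xb||^2 + lam*||b||_1, no intercept."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residual y - X @ beta; beta starts at zero
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Correlation of feature j with the partial residual (feature j excluded)
            rho = sum(X[i][j] * resid[i] for i in range(n)) / n + col_sq[j] * beta[j]
            new_bj = soft_threshold(rho, lam) / col_sq[j]
            delta = new_bj - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new_bj
    return beta


def mse(X, y, beta):
    """Mean squared prediction error of coefficients beta on (X, y)."""
    preds = [sum(x * b for x, b in zip(row, beta)) for row in X]
    return sum((yi - pi) ** 2 for yi, pi in zip(y, preds)) / len(y)


def cv_mse(X, y, lam, k=5):
    """k-fold cross-validated MSE for one penalty value."""
    n = len(X)
    errs = []
    for fold in range(k):
        val_idx = set(range(fold, n, k))
        X_tr = [X[i] for i in range(n) if i not in val_idx]
        y_tr = [y[i] for i in range(n) if i not in val_idx]
        X_val = [X[i] for i in sorted(val_idx)]
        y_val = [y[i] for i in sorted(val_idx)]
        errs.append(mse(X_val, y_val, lasso_fit(X_tr, y_tr, lam)))
    return sum(errs) / k


def make_data(n, p, rng):
    """Simulated sparse linear model (an assumption made for illustration)."""
    true_beta = [2.0, -1.5, 1.0] + [0.0] * (p - 3)
    X = [[rng.gauss(0, 1) for _ in range(p)] for _ in range(n)]
    y = [sum(x * b for x, b in zip(row, true_beta)) + rng.gauss(0, 1) for row in X]
    return X, y


rng = random.Random(0)
X_train, y_train = make_data(60, 15, rng)
X_test, y_test = make_data(500, 15, rng)  # never touched during selection

lambdas = [0.01, 0.03, 0.1, 0.3, 1.0]
cv_scores = {lam: cv_mse(X_train, y_train, lam) for lam in lambdas}
best_lam = min(cv_scores, key=cv_scores.get)  # data-driven choice of penalty

final_beta = lasso_fit(X_train, y_train, best_lam)
test_mse = mse(X_test, y_test, final_beta)

print("chosen penalty:", best_lam)
print("CV MSE at chosen penalty (used for selection):", cv_scores[best_lam])
print("independent test MSE (performance estimate):", test_mse)
```

The fitted coefficients do not depend on the test set in any way, so holding it out cannot make the model better. What it provides is an error estimate that was not itself the quantity being minimized: the CV error at the chosen penalty is the smallest of several noisy estimates and therefore tends to be optimistic, whereas the test error is not subject to that selection effect.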

cbeleites unhappy with SX
  • Thanks for the answer. Can you write an example? – Donbeo May 31 '14 at 13:56
  • I can, but I won't: there are already many questions and answers available on the problems of data-driven model optimization/selection (http://stats.stackexchange.com/search?q=data-driven+model+optimization). Here are starting points for further reading: http://stats.stackexchange.com/questions/79905/cross-validation-including-training-validation-and-testing-why-do-we-need-thr http://stats.stackexchange.com/questions/5918/cross-validation-error-generalization-after-model-selection – cbeleites unhappy with SX May 31 '14 at 14:04