
Usually, when approaching LASSO, the best hyperparameter lambda is assumed to be lambda.1se (in the glmnet package), which is the largest lambda whose cross-validated metric (usually AUC, accuracy, or deviance) is within one standard error of the best (minimized/maximized) value, i.e. the cross-validated error at lambda.min plus one standard error.

This way the penalization is stronger and the model is less prone to overfitting.
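For concreteness, here is a minimal sketch (simulated data, binomial LASSO with deviance as the CV metric; all object and variable names are illustrative, not from the post) of how cv.glmnet() reports both values and how lambda.1se follows from the one-standard-error rule:

    library(glmnet)

    set.seed(1)
    n <- 200; p <- 50
    x <- matrix(rnorm(n * p), n, p)
    y <- rbinom(n, 1, plogis(x[, 1] - x[, 2]))

    cvfit <- cv.glmnet(x, y, family = "binomial", type.measure = "deviance")

    cvfit$lambda.min   # lambda with the minimum cross-validated deviance
    cvfit$lambda.1se   # largest lambda whose CV deviance is within 1 SE of that minimum

    # The one-standard-error rule spelled out from cv.glmnet's own output;
    # this should reproduce cvfit$lambda.1se.
    i_min <- which.min(cvfit$cvm)
    threshold <- cvfit$cvm[i_min] + cvfit$cvsd[i_min]
    max(cvfit$lambda[cvfit$cvm <= threshold])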

However, in caret I always see the assumption that the best model is the one corresponding to the minimum/maximum of the CV metric. Is that wise?

Is there a way to ask caret's train() function to select the model corresponding to lambda.1se?
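For reference, a typical setup of the kind described above might look like the sketch below (simulated data; object names are made up for the illustration). caret's trainControl() takes a selectionFunction argument that governs this choice: it defaults to "best" (pick the tuning values at the best cross-validated metric), while "oneSE" applies caret's built-in one-standard-error rule over the tuning grid, which is the closest analogue to glmnet's lambda.1se, although it is computed from caret's resampling results rather than by cv.glmnet().

    library(caret)
    library(glmnet)

    set.seed(1)
    n <- 200; p <- 50
    x <- matrix(rnorm(n * p), n, p)
    y <- factor(ifelse(x[, 1] - x[, 2] + rnorm(n) > 0, "yes", "no"))

    ctrl <- trainControl(method = "cv", number = 10,
                         classProbs = TRUE,
                         summaryFunction = twoClassSummary,
                         selectionFunction = "best")   # default: pick the best CV metric

    fit <- train(x, y,
                 method = "glmnet",
                 metric = "ROC",
                 trControl = ctrl,
                 tuneGrid = expand.grid(alpha = 1,     # alpha = 1 is the LASSO
                                        lambda = 10^seq(-4, 0, length.out = 50)))

    fit$bestTune   # the (alpha, lambda) pair at the maximum cross-validated ROC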

Thank you.

1 Answer


Usually, when approaching LASSO, the best hyperparameter lambda is assumed to be lambda.1se (in the glmnet package), which is the largest lambda whose cross-validated metric (usually AUC, accuracy, or deviance) is within one standard error of the best (minimized/maximized) value, i.e. the cross-validated error at lambda.min plus one standard error.

That isn't necessarily the case. The 1-standard-error rule (lambda.1se) can provide a more parsimonious model than the one at the minimum cross-validated error (lambda.min), but it's just a heuristic with no strong theoretical justification. In fact, in the illustration of cross-validation for LASSO in Section 6.6.2 of ISLR, cv.glmnet() is used with lambda.min as the criterion.

I use lambda.min as a default, and think of lambda.1se as an option only if you need to reduce the number of retained predictors even further.
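To make the parsimony point concrete, here is a minimal sketch (simulated data with only two truly active predictors; illustrative only, not from the answer) comparing how many predictors are retained at the two values of lambda:

    library(glmnet)

    set.seed(1)
    n <- 200; p <- 50
    x <- matrix(rnorm(n * p), n, p)
    y <- x[, 1] - x[, 2] + rnorm(n)

    cvfit <- cv.glmnet(x, y)   # Gaussian LASSO with 10-fold CV by default

    count_nonzero <- function(s) {
      cf <- as.matrix(coef(cvfit, s = s))
      sum(cf[-1, 1] != 0)      # drop the intercept row before counting
    }

    count_nonzero("lambda.min")   # usually retains more predictors
    count_nonzero("lambda.1se")   # usually retains fewer: the more parsimonious model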

EdM