Could somebody explain why ridge regression does not perform feature selection even though it uses regularization? It penalizes the regression coefficients just as the LASSO does, so why do all features stay in the model for every value of lambda (the penalty) in the range? Why don't some coefficients become exactly zero under heavy penalization? I know this is a very basic question, but I would appreciate any response. Thanks.
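
The behavior described above can be reproduced in a minimal sketch. It assumes the special case of an orthonormal design (X^T X = I), where both estimators have well-known closed forms in terms of the OLS coefficients. The objective scalings, the function names `ridge_coefs` and `lasso_coefs`, and the coefficient values in `b_ols` are all assumptions made for illustration, not something taken from a particular library.

```python
# Special case assumed for illustration: orthonormal design (X^T X = I),
# so both estimators can be written in terms of the OLS coefficients.
# Assumed objectives:
#   ridge: 0.5 * ||y - Xb||^2 + 0.5 * lam * ||b||^2  ->  b_ols / (1 + lam)
#   lasso: 0.5 * ||y - Xb||^2 + lam * ||b||_1        ->  soft-threshold at lam


def ridge_coefs(b_ols, lam):
    """Ridge rescales every coefficient by the same factor 1 / (1 + lam)."""
    return [b / (1.0 + lam) for b in b_ols]


def lasso_coefs(b_ols, lam):
    """Lasso moves each coefficient toward zero by lam and clips at zero."""
    return [(1 if b > 0 else -1) * max(abs(b) - lam, 0.0) for b in b_ols]


b_ols = [3.0, -1.5, 0.4]  # hypothetical OLS coefficients

for lam in [0.1, 1.0, 10.0, 100.0]:
    ridge = ridge_coefs(b_ols, lam)
    lasso = lasso_coefs(b_ols, lam)
    # Print lambda, then the ridge and lasso coefficients side by side.
    print(lam, [round(b, 4) for b in ridge], [round(b, 4) for b in lasso])
```

Under these assumptions, the ridge coefficients keep shrinking as lambda grows but stay nonzero for any finite lambda, while the lasso coefficients are set exactly to zero once lambda exceeds their absolute OLS value.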

user5054
  • See [here](http://stats.stackexchange.com/questions/74542/why-does-the-lasso-provide-variable-selection) – Glen_b Feb 18 '14 at 07:09
  • Also see figure 3.11 in [ESL II](http://statweb.stanford.edu/~tibs/ElemStatLearn/download.html) (page labelled "71" in the 10th printing of the book) and the nearby discussion. – Glen_b Feb 18 '14 at 07:19

0 Answers