In a linear regression problem, $y = (y_1, \cdots, y_{80})$ is the response and $X = (x_1, \cdots, x_{80})$ is a $4500 \times 80$ matrix of predictors; $k = (k_1, \cdots, k_{4500})$ is the vector of regression coefficients.

I want to select $2$ features out of the $4500$ highly correlated features, so I plan to filter the features with LASSO first. Best subset selection then becomes affordable on the reduced feature space.

What I don't know is whether the number of features $M$ to keep after filtering should be treated as a hyperparameter. I read somewhere that best subset selection becomes impractical once the feature space has more than $40$ features, but I don't know whether this applies to my case.
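To give a rough sense of scale, the sketch below counts how many candidate models an exhaustive search would have to fit. It assumes the "$40$ features" rule of thumb refers to searching over subsets of every size, and `n_subsets` is a hypothetical helper name introduced only for this illustration.

```python
from math import comb


def n_subsets(m, k=None):
    """Count candidate models for an exhaustive best subset search.

    k=None counts all non-empty subsets of m features;
    otherwise only subsets of exactly k features are counted.
    """
    if k is None:
        return 2 ** m - 1
    return comb(m, k)


# All subset sizes among 40 features (the regime the rule of thumb is assumed to describe)
print(n_subsets(40))
# Only pairs among 40 features
print(n_subsets(40, 2))
# Only pairs among all 4500 features, with no filtering at all
print(n_subsets(4500, 2))
```

If the target size really is fixed at $2$, the count grows only quadratically in $M$, so the practical limit on $M$ may be much looser than $40$.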

On the other hand, I learned that LASSO is unstable for feature selection (but I don't know what this means, and references would be appreciated). So it seems to me, though without any explicit reason, that I should choose $M$ through LOOCV.

Should I choose this parameter through LOOCV? If not, how large should $M$ be?

meTchaikovsky
  • I've tried to answer some of your questions, but as it stands your post probably contains too many of them to receive a specific answer. I would suggest looking at other Stack Exchange posts on feature selection and LOOCV, which cover most of the points you mention here. – Xavier Bourret Sicotte Sep 04 '18 at 08:14
  • @Xavier Bourret Sicotte I have read those references in your answer and I find them really helpful, thanks! – meTchaikovsky Sep 04 '18 at 08:19

1 Answer

LASSO is unstable for feature selection

The combination of LASSO and cross-validation isn't always unstable for feature selection. Factors leading to instability are:

But I don't know what this means, and references would be appreciated

Intuitively, a stable algorithm is one for which the prediction does not change much when the training data is modified slightly.
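One way to make this intuition concrete for feature selection is to drop one observation at a time, refit, and record how often each feature is selected. The sketch below does this on synthetic, highly correlated data with a small pure-Python coordinate-descent LASSO. Everything in it is an assumption made for illustration: the helper names (`lasso_cd`, `loo_selection_frequencies`, `make_correlated_data`), the data-generating process, the penalty value `lam=0.3`, and the absence of an intercept. It looks at the selected set rather than at predictions, so it is a related diagnostic and not the formal definition of stability.

```python
import random


def soft_threshold(z, t):
    """Soft-thresholding operator used in the LASSO coordinate update."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def lasso_cd(X, y, lam, n_iter=100):
    """Coordinate descent for (1/(2n)) * ||y - X b||^2 + lam * ||b||_1.

    X is a list of rows (observations); no intercept is fitted.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # current residual y - X beta
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Correlation of feature j with the partial residual
            rho = sum(X[i][j] * resid[i] for i in range(n)) / n + col_sq[j] * beta[j]
            new = soft_threshold(rho, lam) / col_sq[j]
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new
    return beta


def loo_selection_frequencies(X, y, lam):
    """Fraction of leave-one-out fits in which each feature gets a nonzero coefficient."""
    n, p = len(X), len(X[0])
    counts = [0] * p
    for left_out in range(n):
        X_sub = X[:left_out] + X[left_out + 1:]
        y_sub = y[:left_out] + y[left_out + 1:]
        beta = lasso_cd(X_sub, y_sub, lam)
        for j in range(p):
            if beta[j] != 0.0:
                counts[j] += 1
    return [c / n for c in counts]


def make_correlated_data(n=30, p=6, seed=1):
    """Synthetic data: all features share one latent factor; y depends on the first two."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        z = rng.gauss(0, 1)
        row = [z + 0.3 * rng.gauss(0, 1) for _ in range(p)]
        X.append(row)
        y.append(row[0] + row[1] + 0.5 * rng.gauss(0, 1))
    return X, y


X, y = make_correlated_data()
freqs = loo_selection_frequencies(X, y, lam=0.3)
print(freqs)
```

Frequencies close to $0$ or $1$ for every feature suggest that the selected set barely depends on which observation is removed. Intermediate values indicate that small changes to the training data change which features are selected.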

Here are a few references on algorithmic stability

Xavier Bourret Sicotte