I've been reading about feature selection and hyperparameter tuning, but I'm getting lost on how to properly code/set up the experiment. I'm working on a classification ML experiment with 1200 samples and 400 features, and I would like to optimize my models. My plan is to do a stratified k-fold analysis, use RFE for feature selection, and do hyperparameter tuning for models where applicable. My understanding is that both the feature selection and the hyperparameter tuning should happen inside each fold of the cross-validation loop? I was wondering how that would be done in Python. My instinct is that I need some combination of RFE (or RFECV) and GridSearchCV.
Does this thought process make sense?
- Split the data into a training and a test set; set the test set aside for now.
- On the training set, use GridSearchCV with stratified K-fold cross-validation, and embed RFE within the loop
- Select the best model
- Evaluate that best model on the held-out test set (rough sketch of what I mean below)
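If it helps, here is roughly what I imagine option A looking like. This is only a minimal sketch of my own guess, assuming made-up data of the same shape (`make_classification`), a placeholder linear SVC, 5 folds, a 20% test split, and arbitrary grid values; none of those choices are set in stone. My understanding is that because RFE sits inside the Pipeline, GridSearchCV refits it on each fold's training split, which is what I mean by doing feature selection inside the loop, but please correct me if I've wired it up wrong:

```python
# Option A sketch (assumptions: placeholder data, placeholder linear SVC, 5 folds)
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split, StratifiedKFold, GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.feature_selection import RFE
from sklearn.svm import SVC

# Placeholder data with the same shape as my real dataset (1200 x 400)
X, y = make_classification(n_samples=1200, n_features=400, random_state=0)

# Step 1: hold out a stratified test set and ignore it until the very end
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Step 2: pipeline = RFE + classifier; GridSearchCV runs stratified K-fold CV
# and refits RFE on each fold's training split, so no fold's validation data leaks in.
pipe = Pipeline([
    ("rfe", RFE(estimator=SVC(kernel="linear"))),
    ("clf", SVC(kernel="linear")),
])
param_grid = {
    "rfe__n_features_to_select": [20, 50, 100],  # number of features tuned as a hyperparameter
    "clf__C": [0.1, 1, 10],
}
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
search = GridSearchCV(pipe, param_grid, cv=cv, scoring="accuracy", n_jobs=-1)
search.fit(X_train, y_train)

# Steps 3-4: best pipeline (refit on the full training set) evaluated once on the test set
print(search.best_params_)
print(search.score(X_test, y_test))
```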
OR
- Split the data into training/test sets
- Run K-fold RFE (RFECV) on the training set for a given model
- Keep the features identified by RFE
- Then perform hyperparameter tuning using only those features (sketched below)
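And here is roughly how I picture option B, with the same placeholder data/estimator assumptions as above. The part that worries me is that the feature selection only runs once on the whole training set here, rather than inside every tuning fold like in option A:

```python
# Option B sketch (same assumptions: placeholder data, placeholder linear SVC, 5 folds)
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split, StratifiedKFold, GridSearchCV
from sklearn.feature_selection import RFECV
from sklearn.svm import SVC

X, y = make_classification(n_samples=1200, n_features=400, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

# Step 2: stratified K-fold RFE (RFECV) on the training set to choose the feature subset
selector = RFECV(estimator=SVC(kernel="linear"), step=10, cv=cv, scoring="accuracy")
selector.fit(X_train, y_train)
X_train_sel = selector.transform(X_train)
X_test_sel = selector.transform(X_test)

# Step 3: hyperparameter tuning restricted to the selected features only
search = GridSearchCV(SVC(kernel="linear"), {"C": [0.1, 1, 10]}, cv=cv, scoring="accuracy")
search.fit(X_train_sel, y_train)

print(selector.n_features_, search.best_params_)
print(search.score(X_test_sel, y_test))
```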
Does this make sense? Could someone provide some example code so I can see it laid out?
Thanks!