Until now I have used the following flow for training a random forest model:
create 10 folds of data.
for each fold i:
- use ith fold as validation data
- use remaining 9 folds as training data
- apply normalization on training and validation data
- # apply feature selection on training data
- # select same features from validation data
- train random forest on training data
- predict values for validation data
combine all predictions.
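
As a sketch, the flow above might look like this in R. The iris data, the choice of Sepal.Length as the outcome, the seed, and ntree = 200 are placeholder assumptions for illustration, not part of the original setup. It also assumes the normalization parameters are estimated on the training folds only and then applied to both sets.

```r
library(caret)         # createFolds(), preProcess()
library(randomForest)  # randomForest()

set.seed(42)

# Placeholder data (assumption): predict Sepal.Length from the other measurements
x <- iris[, c("Sepal.Width", "Petal.Length", "Petal.Width")]
y <- iris$Sepal.Length

folds <- createFolds(y, k = 10)      # list of held-out (validation) row indices
cv_pred <- rep(NA_real_, length(y))  # out-of-fold predictions, filled per fold

for (i in seq_along(folds)) {
  val_idx <- folds[[i]]
  x_train <- x[-val_idx, , drop = FALSE]
  x_val   <- x[val_idx, , drop = FALSE]
  y_train <- y[-val_idx]

  # Estimate centering/scaling on the training folds, then apply to both sets
  pp <- preProcess(x_train, method = c("center", "scale"))
  x_train <- predict(pp, x_train)
  x_val   <- predict(pp, x_val)

  # (feature selection would go here, using the training folds only)

  fit <- randomForest(x = x_train, y = y_train, ntree = 200)
  cv_pred[val_idx] <- predict(fit, x_val)
}

# Combine all predictions into a single cross-validated error estimate
cv_rmse <- sqrt(mean((cv_pred - y)^2))
```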
Now I want to do feature selection using the varImp() function. I am confused because I have read that varImp itself trains a model on the training data to find the best set of features.
How should I use varImp to get the important features (say, using partial least squares) and then train the random forest on the training data using only those features?
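
To make the question concrete, the per-fold step might look roughly like the sketch below. It assumes caret's train() with method = "pls" (which requires the pls package), a fixed ncomp = 2, and keeping the top 2 features. These values, like the iris placeholder data, are illustrative assumptions. Here varImp() is called on an already fitted PLS model. Whether this is a sound way to use it is the open point.

```r
library(caret)         # train(), trainControl(), varImp()
library(randomForest)  # randomForest()

set.seed(42)

# Placeholder data and a single train/validation split (assumptions)
x <- iris[, c("Sepal.Width", "Petal.Length", "Petal.Width")]
y <- iris$Sepal.Length
val_idx <- sample(nrow(x), 15)
x_train <- x[-val_idx, , drop = FALSE]
x_val   <- x[val_idx, , drop = FALSE]
y_train <- y[-val_idx]

# Fit a PLS model on the training data only (no resampling, fixed ncomp)
pls_fit <- train(x = x_train, y = y_train, method = "pls",
                 preProcess = c("center", "scale"),
                 trControl = trainControl(method = "none"),
                 tuneGrid = data.frame(ncomp = 2))

# Extract importance scores from the fitted model and rank the features;
# the first column is used as the importance score for this numeric outcome
imp <- varImp(pls_fit)$importance
ranked <- rownames(imp)[order(imp[, 1], decreasing = TRUE)]
keep <- ranked[seq_len(2)]  # hypothetical cutoff: keep the top 2 features

# Train the random forest on the selected features and predict the validation set
rf_fit <- randomForest(x = x_train[, keep, drop = FALSE], y = y_train, ntree = 200)
val_pred <- predict(rf_fit, x_val[, keep, drop = FALSE])
```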