
For categorical variables that are one-hot encoded, each level is now a separate feature in my data. For example:

"Married" is now "Married_Y" and "Married_N", so there is a chance that RFE will keep only "Married_Y". I imagine this is not a valid way to do feature selection (if I'm wrong, please let me know how to interpret this type of situation).

In this case, how do I go about doing RFE? My goal is to run RFE on a logistic regression model to understand which variables I should consider keeping.
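To make the situation concrete, here is a minimal sketch of the kind of setup described. It assumes pandas and scikit-learn (the question does not say which tools are used). The data is synthetic, and the `Income` and `Age` columns, the sample size, and the choice of keeping two features are made up purely for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression

# Synthetic data: only "Married" comes from the question; the rest is assumed.
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "Married": rng.choice(["Y", "N"], size=n),
    "Income": rng.normal(50, 10, size=n),
    "Age": rng.normal(40, 12, size=n),
})
# Hypothetical binary target loosely related to marital status.
y = ((df["Married"] == "Y").astype(int) + rng.normal(0, 1, size=n) > 0.5).astype(int)

# One-hot encode: "Married" becomes "Married_N" and "Married_Y".
X = pd.get_dummies(df, columns=["Married"], dtype=float)

# RFE ranks each column separately, so it can drop one level and keep the other.
selector = RFE(LogisticRegression(max_iter=1000), n_features_to_select=2)
selector.fit(X, y)

selected = X.columns[selector.support_].tolist()
print(selected)
```

Because RFE treats `Married_Y` and `Married_N` as unrelated columns, the selected list can contain one level without the other, which is the case being asked about.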

semidevil
  • This is a FAQ! https://stats.stackexchange.com/questions/24298/can-i-ignore-coefficients-for-non-significant-levels-of-factors-in-a-linear-mode – kjetil b halvorsen Mar 17 '21 at 09:58
