I have a binary supervised classification problem with about 62 features; by eye, about 30 of them could have reasonable discriminating power. I am using sklearn, and its MLP does not have a dedicated feature selection tool like decision trees do. What is the recommended way to perform feature selection here? I have read in the sklearn documentation that LDA should not be performed in a binary classification problem, and PCA is listed under the unsupervised methods on the sklearn website.

Does anyone have any experience with this that could suggest a method?

(P.S. Apologies if this question isn't up to standard; this is the first question I have ever asked.)

MaggaP

1 Answer

Two suggestions:

  • One important reason to use a neural network is that the model can do "feature selection and feature engineering" automatically for us. Unless the problem is huge (say, millions of features), feature selection is not necessary for a neural network.

  • Using PCA for feature selection in supervised learning is bad practice, since it does not consider the "correlation" between a feature and the label; it simply favors directions with large variance. In other words, the data can contain a completely useless feature with large variance, and PCA will select it. For details, see my answer to "How to decide between PCA and logistic regression?"
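
The PCA point can be illustrated with a small synthetic sketch. Everything here is made up for illustration and is not from the question's data: one feature (`informative`) shifts with the label but has small variance, the other (`noise`) is independent of the label but has large variance. The first principal component should load almost entirely on the noise feature, even though only the other feature correlates with the label.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
n = 2000

# Hypothetical binary labels.
y = rng.integers(0, 2, size=n)

# Small-variance feature that actually depends on the label.
informative = y + 0.3 * rng.normal(size=n)

# Large-variance feature that is independent of the label.
noise = 10.0 * rng.normal(size=n)

X = np.column_stack([informative, noise])

# PCA never sees y: it only looks for the direction of largest variance.
pca = PCA(n_components=1).fit(X)
loadings = np.abs(pca.components_[0])

# Absolute correlation of each raw feature with the label, for comparison.
corr_with_label = [abs(np.corrcoef(X[:, j], y)[0, 1]) for j in range(X.shape[1])]

print("PC1 loadings (informative, noise):", loadings)
print("abs. correlation with label (informative, noise):", corr_with_label)
```

Keeping only the first component here would discard nearly all of the label-relevant signal. Standardizing the features first would change the variances, but PCA would still ignore the label.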

Haitao Du