I'm tempted not to remove any features before running the data through a feature selection algorithm. But does the need to do so depend on the algorithm? For example, a filter method such as mRMR should automatically deal with correlation via mutual information or some other measure of "redundancy" between features, which is minimized as features are added to the selected set. Or am I missing something? And what about embedded methods, such as boosting and random forests? I realize these can be used for prediction, but here I'm mainly interested in "feature importance". A rough sketch of the mRMR behaviour I have in mind is below.
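To make the mRMR point concrete, here is a minimal sketch of the kind of greedy "relevance minus redundancy" selection I mean, assuming a pandas DataFrame `X` of continuous features and a continuous target `y`; the function name and the use of scikit-learn's `mutual_info_regression` are just for illustration, not the reference mRMR implementation:

```python
import numpy as np
import pandas as pd
from sklearn.feature_selection import mutual_info_regression

def mrmr_select(X: pd.DataFrame, y, k: int = 5):
    """Greedy mRMR-style selection: maximize relevance minus redundancy."""
    # Relevance: mutual information between each feature and the target.
    relevance = pd.Series(mutual_info_regression(X, y), index=X.columns)
    selected, remaining = [], list(X.columns)
    for _ in range(min(k, len(remaining))):
        best, best_score = None, -np.inf
        for f in remaining:
            # Redundancy: mean MI between the candidate and already-selected features.
            redundancy = (
                np.mean([mutual_info_regression(X[[f]], X[s])[0] for s in selected])
                if selected else 0.0
            )
            score = relevance[f] - redundancy
            if score > best_score:
                best, best_score = f, score
        selected.append(best)
        remaining.remove(best)
    return selected
```

If that is roughly what mRMR does internally, then a near-duplicate of an already-selected feature incurs a large redundancy penalty and should rarely be picked, which is why I'm unsure whether pre-filtering correlated features buys anything.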
UPDATE
To clarify, my features have physical meaning, and I don't want to project them into a reduced space and lose that physical interpretation. I'm primarily interested in identifying the features that are most informative with respect to a target variable. The concern is: if there are highly correlated features (which I could detect with Spearman or Pearson correlation coefficients, for example, as in the sketch below), should they be removed before running a feature selection algorithm, or not? I suspect that removing features I judge to be highly correlated may affect the results, but perhaps, depending on the algorithm, there's no need to do that beforehand.
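For reference, the pre-screening step I would otherwise apply looks roughly like this; `X` is again assumed to be a pandas DataFrame, and the helper name and the 0.9 cutoff are arbitrary choices for illustration:

```python
import pandas as pd

def correlated_pairs(X: pd.DataFrame, threshold: float = 0.9, method: str = "spearman"):
    """List feature pairs whose absolute pairwise correlation meets the threshold."""
    corr = X.corr(method=method).abs()
    cols = corr.columns
    pairs = []
    for i in range(len(cols)):
        for j in range(i + 1, len(cols)):
            if corr.iloc[i, j] >= threshold:
                pairs.append((cols[i], cols[j], float(corr.iloc[i, j])))
    return pairs
```

The question is whether, for each flagged pair, I should drop one feature before running the selection algorithm, or let the algorithm handle the redundancy itself.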