
Suppose you have a very large feature set (thousands to 1,000,000 features) when building a machine learning model. How do you go about selecting the features?

I know of the following methods for feature selection when the number of features is not extremely large:

(1) (supervised) PCA
(2) Lasso
(3) ElasticNet
(4) Forward step-wise selection
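
A minimal sketch of how (2) and (4) might be combined with scikit-learn. The synthetic data, the shapes, and the two-stage idea (a coarse Lasso cut first, because forward step-wise search is far too slow on thousands of raw features) are illustrative assumptions, not a prescription:

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.feature_selection import SelectFromModel, SequentialFeatureSelector
from sklearn.linear_model import LassoCV, LinearRegression
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for a wide feature matrix (hypothetical sizes).
X, y = make_regression(n_samples=500, n_features=5000, n_informative=20,
                       noise=0.1, random_state=0)
X = StandardScaler().fit_transform(X)

# (2) Lasso as a coarse filter: keep features with non-zero coefficients.
lasso = LassoCV(cv=5, random_state=0).fit(X, y)
coarse = SelectFromModel(lasso, prefit=True)
X_coarse = coarse.transform(X)
print("after Lasso cut:", X_coarse.shape)

# (4) Forward step-wise selection on the reduced set only; running it
# directly on thousands of features would be impractical.
sfs = SequentialFeatureSelector(LinearRegression(), n_features_to_select=10,
                                direction="forward", cv=5)
X_final = sfs.fit_transform(X_coarse, y)
print("after forward selection:", X_final.shape)
```

ElasticNetCV can be swapped in for LassoCV in the same pattern when groups of correlated features should be kept together rather than pruned to a single representative.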

Golsheed B
  • Related to (4), step-wise VIF exclusion can be used to get rid of variables with high multicollinearity (a sketch follows these comments). – ERT Aug 17 '18 at 17:19
  • A nice article (which you have probably read) [can be found here](https://machinelearningmastery.com/an-introduction-to-feature-selection/). – ERT Aug 17 '18 at 17:21
  • In the ML domain, some people have had success with autoencoders, which act as a way to extract a compact set of features from the data. Several variations exist, all of which let you feed in a large number of features and get out a smaller number before running your data through a complex neural network (sketch after these comments). – ERT Aug 17 '18 at 17:23
  • You may also want to look into reducing noise before applying these methods (if your data are noisy, like market data). I've seen good results with the wavelet transform (sketch after these comments). – ERT Aug 17 '18 at 17:27
  • Clustering (k-means, hierarchical) may also give you what you are looking for, with results similar to PCA, though [as we can see here](https://stats.stackexchange.com/q/183236/213806) the two methods are distinct (sketch after these comments). – ERT Aug 17 '18 at 17:33
  • [Another interesting paper](http://www.jmlr.org/papers/volume3/guyon03a/guyon03a.pdf) goes over a number of "feature selection" or data compression methods. Sorry for all of the comments, got sucked into a wormhole. – ERT Aug 17 '18 at 17:35
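
On the step-wise VIF exclusion mentioned above: a minimal sketch using statsmodels' `variance_inflation_factor`. The `drop_high_vif` helper and the threshold of 10 (a common rule of thumb) are assumptions; since every pass refits one OLS regression per remaining column, this is only practical after a coarse first cut, not on the full feature set.

```python
import numpy as np
import pandas as pd
from statsmodels.stats.outliers_influence import variance_inflation_factor

def drop_high_vif(df: pd.DataFrame, threshold: float = 10.0) -> pd.DataFrame:
    """Iteratively drop the column with the highest VIF until all
    remaining VIFs fall below `threshold`."""
    cols = list(df.columns)
    while len(cols) > 1:
        vifs = [variance_inflation_factor(df[cols].values, i)
                for i in range(len(cols))]
        worst = int(np.argmax(vifs))
        if vifs[worst] < threshold:
            break
        cols.pop(worst)  # remove the most collinear column and re-check
    return df[cols]

# Hypothetical usage on a frame of predictors:
# X_reduced = drop_high_vif(X_df, threshold=10.0)
```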
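On the autoencoder suggestion: a minimal sketch of a single-hidden-layer autoencoder in Keras. The sizes (5,000 inputs, a 64-dimensional code) and the random stand-in data are assumptions; the point is only that the trained encoder maps the wide input to a compact learned representation.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

n_features, code_dim = 5000, 64  # hypothetical sizes

inputs = keras.Input(shape=(n_features,))
code = layers.Dense(code_dim, activation="relu")(inputs)       # encoder
outputs = layers.Dense(n_features, activation="linear")(code)  # decoder

autoencoder = keras.Model(inputs, outputs)
encoder = keras.Model(inputs, code)
autoencoder.compile(optimizer="adam", loss="mse")

X = np.random.rand(1000, n_features).astype("float32")  # stand-in data
autoencoder.fit(X, X, epochs=10, batch_size=64, verbose=0)

X_code = encoder.predict(X)  # compressed representation
print(X_code.shape)          # (1000, 64)
```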
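On wavelet denoising: a minimal sketch using PyWavelets with soft thresholding of the detail coefficients at the universal threshold. The `db4` wavelet, the decomposition level, and the random-walk stand-in signal are assumptions.

```python
import numpy as np
import pywt

def wavelet_denoise(signal: np.ndarray, wavelet: str = "db4",
                    level: int = 3) -> np.ndarray:
    """Soft-threshold the detail coefficients of a 1-D signal."""
    coeffs = pywt.wavedec(signal, wavelet, level=level)
    # Noise scale from the finest detail band (median-absolute-deviation
    # estimator), then the universal threshold sigma * sqrt(2 * log n).
    sigma = np.median(np.abs(coeffs[-1])) / 0.6745
    thresh = sigma * np.sqrt(2.0 * np.log(len(signal)))
    coeffs[1:] = [pywt.threshold(c, thresh, mode="soft") for c in coeffs[1:]]
    return pywt.waverec(coeffs, wavelet)[: len(signal)]

# Hypothetical usage: denoise a noisy series before feature selection.
noisy = np.cumsum(np.random.randn(1024))  # random-walk stand-in
clean = wavelet_denoise(noisy)
```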
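On clustering: one concrete way to cluster features rather than samples is scikit-learn's `FeatureAgglomeration`, which groups similar features hierarchically and pools each cluster (the mean by default). The cluster count of 100 and the random stand-in matrix are assumptions.

```python
import numpy as np
from sklearn.cluster import FeatureAgglomeration

X = np.random.rand(500, 5000)  # stand-in wide feature matrix

# Merge the 5000 features into 100 clusters; each output column is the
# pooled value of one cluster, similar in spirit to PCA but built from
# groups of original features rather than dense linear combinations.
agglo = FeatureAgglomeration(n_clusters=100)
X_reduced = agglo.fit_transform(X)
print(X_reduced.shape)  # (500, 100)

# agglo.labels_ maps each original feature to its cluster, so you can
# also pick one representative feature per cluster instead of pooling.
```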

0 Answers