I want to get some intuition on normalization prior to feature selection with PCA.
I'm sure z-normalization is a bad idea: since it scales every feature to unit variance, PCA will be meaningless, because it will pick the "most varying" features essentially at random. Right?
But my data consists of features with very different ranges (some are on the order of 1e-1, whereas others are on the order of 1e-8), and I want to combine them in a discriminative classifier.
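
To make the scale issue concrete, here is a minimal sketch on synthetic data. Everything in it is an assumption made up for illustration, not the actual features: the three columns, their scales, the shared `latent` signal, and the helper `pca`. It compares PCA on the raw features with PCA on the z-scored features, using plain NumPy.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000

# Hypothetical data: one large-scale feature that is pure noise, and two
# tiny-scale features that share a common latent signal.
latent = rng.normal(size=n)
X = np.column_stack([
    1e-1 * rng.normal(size=n),                   # ~1e-1, independent noise
    1e-8 * (latent + 0.1 * rng.normal(size=n)),  # ~1e-8, driven by latent
    1e-8 * (latent + 0.1 * rng.normal(size=n)),  # ~1e-8, driven by latent
])

def pca(data):
    """Return (explained-variance ratios, components as rows), sorted by
    decreasing variance, from the eigendecomposition of the covariance."""
    cov = np.cov(data, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)       # ascending eigenvalues
    order = np.argsort(eigvals)[::-1]
    eigvals = eigvals[order]
    return eigvals / eigvals.sum(), eigvecs[:, order].T

# PCA on the raw features.
ratio_raw, comp_raw = pca(X)

# PCA on z-scored features (zero mean, unit sample variance per column).
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
ratio_z, comp_z = pca(Z)

print("raw      variance ratios:", ratio_raw)
print("raw      first component:", comp_raw[0])
print("z-scored variance ratios:", ratio_z)
print("z-scored first component:", comp_z[0])
```

On data like this, raw PCA is dominated almost entirely by the 1e-1 feature, simply because of its scale. After z-scoring, the covariance matrix of `Z` is the correlation matrix of `X`, so the leading component is determined by the correlations between features rather than by their raw ranges.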
How can I approach this normalization + feature reduction problem?
Thanks for any help!