I performed PCA (sklearn) on a pandas DataFrame of shape (1416960, 140). The resulting components_ attribute is a matrix in which each row is one principal component: an array with one weight per feature that describes that component's direction of maximum variance.
To find which feature is most "correlated" with each component, I take the feature with the largest absolute weight in that component's row (as also shown in: https://stackoverflow.com/questions/50796024/feature-variable-importance-after-a-pca-analysis).
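For reference, this is roughly what I am doing. It is a minimal sketch only: the small random DataFrame, the column names f0..f5, the choice of 3 components, and the variable names are placeholders standing in for my real data and code.

```python
import numpy as np
import pandas as pd
from sklearn.decomposition import PCA

# Hypothetical stand-in for the real (1416960, 140) DataFrame.
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.normal(size=(500, 6)),
                  columns=[f"f{i}" for i in range(6)])

# Number of components chosen arbitrarily for the illustration.
pca = PCA(n_components=3).fit(df)

# components_ has shape (n_components, n_features):
# one row of per-feature weights for each principal component.
abs_weights = np.abs(pca.components_)

# For each component (row), pick the feature with the largest absolute weight.
top_idx = abs_weights.argmax(axis=1)
top_features = df.columns[top_idx].tolist()

print(top_features)
```

Nothing in this selection forces the entries of top_features to be distinct, which is the situation I describe next.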
My problem is that the same feature comes out on top for several components, so different components end up with the same most important feature. What causes this? Can I avoid it, or at least mitigate it?
I couldn't find any thread on this topic.