I am using scikit-learn's PCA to reduce the features of my dataset. To determine which features in the original space are most associated with which principal component, I look at the sklearn.decomposition.PCA.components_ attribute; per the documentation, this is an array of shape [number_of_principal_components, number_of_features].
I simply iterate over each component, take the absolute values of the associated loading vector, and sort them in descending order. What I am seeing is that the most influential features are repeated across the first PC (PC1) and the second PC (PC2). For example, feature_1 shows up among the top 5 associated features for both PC1 and PC2.
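Here is a minimal sketch of what I'm doing, with random data and made-up feature names standing in for my actual dataset:

```python
import numpy as np
from sklearn.decomposition import PCA

# Random data standing in for the real, rescaled dataset.
rng = np.random.default_rng(0)
X = rng.random((100, 10))
feature_names = [f"feature_{j}" for j in range(X.shape[1])]

pca = PCA(n_components=5).fit(X)

# components_ has shape (n_components, n_features); each row holds the
# loadings of every original feature on one principal component.
for i, component in enumerate(pca.components_):
    # Rank features by absolute loading, largest first.
    order = np.argsort(np.abs(component))[::-1]
    print(f"PC{i + 1}:", [feature_names[j] for j in order[:5]])
```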
I thought that in PCA, each feature would map primarily to a single PC and thereafter contribute only minimally to the other PCs. Is this understanding incorrect?
Please note that it is recommended to standardize the values before doing PCA, whereas I am only rescaling the values to the unit interval [0, 1]. But even if I standardize the values before applying PCA, I still observe the same behavior.
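For reference, this is roughly how I compared the two preprocessing routes (again a sketch on random data, not my real pipeline):

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import MinMaxScaler, StandardScaler

rng = np.random.default_rng(0)
X = rng.random((100, 10))

# Min-max rescaling to [0, 1] (what I currently do) vs. z-score standardization.
for name, scaler in [("min-max [0, 1]", MinMaxScaler()),
                     ("standardized", StandardScaler())]:
    pca = PCA(n_components=2).fit(scaler.fit_transform(X))
    for i, component in enumerate(pca.components_):
        top = np.argsort(np.abs(component))[::-1][:5]
        print(f"{name} PC{i + 1}: features {top.tolist()}")
```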
What I ultimately want to do is map the features onto the PCs, take the top 3 features influencing each PC, and use only those features for modeling.
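In other words, something along these lines (a sketch of my intended selection step, taking the union of the top 3 feature indices per PC):

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
X = rng.random((100, 10))

pca = PCA(n_components=3).fit(X)

# Collect the 3 most influential features for each PC, then keep the union
# of those column indices as the reduced feature set for modeling.
selected = set()
for component in pca.components_:
    top3 = np.argsort(np.abs(component))[::-1][:3]
    selected.update(top3.tolist())

X_reduced = X[:, sorted(selected)]
print("selected feature indices:", sorted(selected))
print("reduced shape:", X_reduced.shape)
```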
Any help in understanding this would be appreciated.