I am using scikit-learn's PCA to reduce the features of my dataset. To determine which features in the original space are most associated with which principal component, I look at the sklearn.decomposition.PCA.components_ attribute; per the documentation, this is an array of shape [number_of_principal_components, number_of_features].
I simply iterate over each component, take the absolute values of the associated loading vector, and sort them in descending order. What I am seeing is that the most influential features are repeated across the first PC (PC1) and the second PC (PC2). For example, feature_1 shows up among the top 5 associated features for both PC1 and PC2.
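Here is a minimal sketch of what I'm doing, with random data and made-up feature names standing in for my actual dataset:

```python
import numpy as np
from sklearn.decomposition import PCA

# Random data standing in for the real, rescaled dataset.
rng = np.random.default_rng(0)
X = rng.random((100, 10))
feature_names = [f"feature_{j}" for j in range(X.shape[1])]

pca = PCA(n_components=5).fit(X)

# components_ has shape (n_components, n_features); each row holds the
# loadings of every original feature on one principal component.
for i, component in enumerate(pca.components_):
    # Rank features by absolute loading, largest first.
    order = np.argsort(np.abs(component))[::-1]
    print(f"PC{i + 1}:", [feature_names[j] for j in order[:5]])
```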
I thought that in PCA, each feature would map primarily to a single PC and thereafter contribute only minimally to the other PCs. Is this understanding incorrect?
Please note that it is recommended to standardize the values before doing PCA, whereas I am only rescaling the values to the unit interval [0, 1]. But even if I standardize the values before applying PCA, I still observe the same behavior.
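For reference, this is roughly how I compared the two preprocessing routes (again a sketch on random data, not my real pipeline):

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import MinMaxScaler, StandardScaler

rng = np.random.default_rng(0)
X = rng.random((100, 10))

# Min-max rescaling to [0, 1] (what I currently do) vs. z-score standardization.
for name, scaler in [("min-max [0, 1]", MinMaxScaler()),
                     ("standardized", StandardScaler())]:
    pca = PCA(n_components=2).fit(scaler.fit_transform(X))
    for i, component in enumerate(pca.components_):
        top = np.argsort(np.abs(component))[::-1][:5]
        print(f"{name} PC{i + 1}: features {top.tolist()}")
```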
What I ultimately want to do is map the features onto the PCs, take the top 3 features influencing each PC, and use only those features for modeling.
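In other words, something along these lines (a sketch of my intended selection step, taking the union of the top 3 feature indices per PC):

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
X = rng.random((100, 10))

pca = PCA(n_components=3).fit(X)

# Collect the 3 most influential features for each PC, then keep the union
# of those column indices as the reduced feature set for modeling.
selected = set()
for component in pca.components_:
    top3 = np.argsort(np.abs(component))[::-1][:3]
    selected.update(top3.tolist())

X_reduced = X[:, sorted(selected)]
print("selected feature indices:", sorted(selected))
print("reduced shape:", X_reduced.shape)
```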
Any help in understanding this would be appreciated.