I'm trying to run a few tests using princomp in R. In princomp there is a value called loadings, which is supposed to show many things. One of those things is the proportion of variance explained by each principal component.
For some reason, the principal components from this tool always come out such that the reported proportion of variance is the same for all components!
For example, try this on the iris dataset:

    data <- iris[c("Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width")]
    iris_fit <- princomp(data, cor = TRUE)

Then, when I print out the loadings, I see this:

    iris_fit$loadings

    Loadings:
                 Comp.1 Comp.2 Comp.3 Comp.4
    Sepal.Length  0.361 -0.657 -0.582  0.315
    Sepal.Width         -0.730  0.598 -0.320
    Petal.Length  0.857  0.173        -0.480
    Petal.Width   0.358         0.546  0.754

                   Comp.1 Comp.2 Comp.3 Comp.4
    SS loadings      1.00   1.00   1.00   1.00
    Proportion Var   0.25   0.25   0.25   0.25
    Cumulative Var   0.25   0.50   0.75   1.00

Every entry in the Proportion Var row is 0.25, which seems to mean that the variance is split evenly across the components.
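
One way to cross-check that row is to recompute it by hand and set it beside the component variances that princomp stores in sdev. This is only a rough sketch: it assumes the SS loadings row is the column sums of the squared loadings and that Proportion Var divides those sums by the number of variables. The names ss, prop_printed and prop_sdev are made up for illustration.

```r
# Same four iris columns as in the example above
data <- iris[c("Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width")]
iris_fit <- princomp(data, cor = TRUE)

# Assumption: "SS loadings" is the sum of squared loadings per component
ss <- colSums(unclass(iris_fit$loadings)^2)

# Assumption: "Proportion Var" is ss divided by the number of variables
prop_printed <- ss / nrow(iris_fit$loadings)

# For comparison: share of total variance based on the component
# standard deviations stored in sdev
prop_sdev <- iris_fit$sdev^2 / sum(iris_fit$sdev^2)

print(round(prop_printed, 3))
print(round(prop_sdev, 3))
```

If the two printed vectors differ, the Proportion Var row is describing the loadings matrix itself rather than how the variance of the data is distributed across components.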
So I tried a few other datasets with, for example, 2 or 3 features, and every time the Proportion Var is 1 divided by the number of features!
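
The pattern can be reproduced with something like the following sketch. It uses columns of the built-in mtcars dataset purely as a stand-in (not necessarily the datasets tried above), and prop_from_loadings is a hypothetical helper that assumes the printed proportion is the column sum of squared loadings divided by the number of variables.

```r
# Hypothetical helper: proportion row recomputed from the loadings matrix
prop_from_loadings <- function(df) {
  fit <- princomp(df, cor = TRUE)
  L <- unclass(fit$loadings)   # plain numeric matrix of loadings
  colSums(L^2) / nrow(L)       # assumed formula behind "Proportion Var"
}

# Two features, then three features (stand-in columns from mtcars)
p2 <- prop_from_loadings(mtcars[c("mpg", "wt")])
p3 <- prop_from_loadings(mtcars[c("mpg", "wt", "hp")])

print(p2)
print(p3)
```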
Why is this? I want to be able to use this for feature selection, and it is difficult to select components/features when they all show the same proportion of variance.