
I'm running a few tests with princomp in R. The object it returns has a component called loadings, whose printed output includes, among other things, what I take to be the proportion of variance explained by each principal component.

For some reason, the principal components from princomp always come out with the same proportion of variance for every component.

For example, try this on the iris dataset:

data = iris[c("Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width")]
iris_fit <- princomp(data, cor=TRUE)

Then, when I print the loadings, I see this:

iris_fit$loadings

Loadings:
             Comp.1 Comp.2 Comp.3 Comp.4
Sepal.Length  0.361 -0.657 -0.582  0.315
Sepal.Width         -0.730  0.598 -0.320
Petal.Length  0.857  0.173        -0.480
Petal.Width   0.358         0.546  0.754

               Comp.1 Comp.2 Comp.3 Comp.4
SS loadings      1.00   1.00   1.00   1.00
Proportion Var   0.25   0.25   0.25   0.25
Cumulative Var   0.25   0.50   0.75   1.00

All of the Proportion Var row is 0.25, meaning that the variance is split evenly across the components.

So I tried a few other datasets with, for example, 2 or 3 features, and every time the Proportion Var is 1 divided by the number of features.

Why is this? I want to use this output for feature selection, and it is difficult to select components/features when they all appear to explain the same share of the variance.

amoeba
The second part of that print isn't what you think it is. Try `summary(iris_fit)` or `?print.loadings` and `print.default(iris_fit$loadings)` – user20637 Jun 24 '16 at 07:51

If I remember correctly, it has been noticed on this site several times that what that function calls "loadings" are in fact eigenvectors, not loadings. Search this site for `princomp loadings`, `pca loadings vs eigenvectors`. – ttnphns Jun 24 '16 at 10:48
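
To make the comments above concrete, here is a minimal sketch in base R (not part of the original post). It assumes the same kind of fit as in the question. The names `ss_loadings` and `prop_var` are introduced here purely for illustration. The idea is that the "SS loadings" row is just the sum of squared entries in each column of the loadings matrix. If those columns are unit-length eigenvectors, as the second comment suggests, every sum is 1 and "Proportion Var" becomes 1 divided by the number of variables. The per-component variance is instead available from the `sdev` component of the fit, which is also what `summary(iris_fit)` reports.

```r
# Same kind of fit as in the question: columns 1:4 of iris are the four
# numeric measurements (Sepal.Length, Sepal.Width, Petal.Length, Petal.Width).
iris_fit <- princomp(iris[, 1:4], cor = TRUE)

# "SS loadings" is the column-wise sum of squared loadings.
# If each column is a unit-length eigenvector, every sum equals 1,
# which would explain a constant "Proportion Var" of 1/p.
ss_loadings <- colSums(unclass(iris_fit$loadings)^2)
print(ss_loadings)

# Variance explained per component, computed from the component
# standard deviations stored in the fit (illustrative calculation).
prop_var <- iris_fit$sdev^2 / sum(iris_fit$sdev^2)
print(round(prop_var, 3))          # proportion of variance per component
print(round(cumsum(prop_var), 3))  # cumulative proportion
```

Calling `summary(iris_fit)` should show the same proportions in its "Proportion of Variance" row, so that is probably the quantity to use when deciding how many components to keep.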
