16

I ran PCA on 25 variables and selected the top 7 PCs using prcomp.

prc <- prcomp(pollutions, center=TRUE, scale.=TRUE, retx=TRUE)

I have then done varimax rotation on those components.

varimax7 <- varimax(prc$rotation[,1:7])

Now I wish to apply the varimax rotation to the PCA-rotated data (the varimax object contains only the loadings matrix and the rotation matrix, not the rotated data). I read that to do this you multiply the transpose of the rotation matrix by the transpose of the data, so I would have done this:

newData <- t(varimax7$rotmat) %*% t(prc$x[,1:7])

But that doesn't make sense, as the dimensions of the matrix transposes above are $7\times 7$ and $7 \times 16933$ respectively, so I will be left with a matrix of only $7$ rows rather than $16933$ rows... does anyone know what I am doing wrong here, or what my final line should be? Do I just need to transpose back afterwards?
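For reference, a minimal sketch of the transpose fix: since $(AB)^\top = B^\top A^\top$, transposing back is the same as post-multiplying the scores by the rotation matrix (whether these rotated raw scores are the quantity one actually wants is what the answers below address).

newData <- t(t(varimax7$rotmat) %*% t(prc$x[,1:7]))  # now 16933 x 7
# equivalently, post-multiply the scores directly:
newData <- prc$x[,1:7] %*% varimax7$rotmat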

amoeba
Scott

3 Answers

27

"Rotations" is an approach developed in factor analysis; there rotations (such as e.g. varimax) are applied to loadings, not to eigenvectors of the covariance matrix. Loadings are eigenvectors scaled by the square roots of the respective eigenvalues. After the varimax rotation, the loading vectors are not orthogonal anymore (even though the rotation is called "orthogonal"), so one cannot simply compute orthogonal projections of the data onto the rotated loading directions.

@FTusell's answer assumes that varimax rotation is applied to the eigenvectors (not to loadings). This would be pretty unconventional. Please see my detailed account of PCA+varimax for details: Is PCA followed by a rotation (such as varimax) still PCA? Briefly, if we look at the SVD of the data matrix $X=USV^\top$, then to rotate the loadings means inserting $RR^\top$ for some rotation matrix $R$ as follows: $X=(UR)(R^\top SV^\top).$
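A quick numerical check of this identity (a sketch on random data; here $R$ is an arbitrary orthogonal matrix obtained from a QR decomposition):

set.seed(1)
X <- scale(matrix(rnorm(200), 40, 5))
s <- svd(X)                                # X = U S V'
R <- qr.Q(qr(matrix(rnorm(25), 5, 5)))     # an arbitrary orthogonal matrix
Xrec <- (s$u %*% R) %*% (t(R) %*% diag(s$d) %*% t(s$v))
max(abs(X - Xrec))                         # ~ 0: inserting RR' leaves X unchanged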

If rotation is applied to loadings (as it usually is), then there are at least three easy ways to compute varimax-rotated PCs in R:

  1. They are readily available via function psych::principal (demonstrating that this is indeed the standard approach). Note that it returns standardized scores, i.e. all PCs have unit variance.

  2. One can manually use varimax function to rotate the loadings, and then use the new rotated loadings to obtain the scores; one needs to multiply the data with the transposed pseudo-inverse of the rotated loadings (see formulas in this answer by @ttnphns). This will also yield standardized scores.

  3. One can use varimax function to rotate the loadings, and then use the $rotmat rotation matrix to rotate the standardized scores obtained with prcomp.

All three methods yield the same result:

irisX <- iris[,1:4]      # Iris data
ncomp <- 2

pca_iris_rotated <- psych::principal(irisX, rotate="varimax", nfactors=ncomp, scores=TRUE)
print(pca_iris_rotated$scores[1:5,])  # Scores returned by principal()

pca_iris        <- prcomp(irisX, center=TRUE, scale.=TRUE)
rawLoadings     <- pca_iris$rotation[,1:ncomp] %*% diag(pca_iris$sdev, ncomp, ncomp)
rotatedLoadings <- varimax(rawLoadings)$loadings
invLoadings     <- t(pracma::pinv(rotatedLoadings))
scores          <- scale(irisX) %*% invLoadings
print(scores[1:5,])                   # Scores computed via rotated loadings

scores <- scale(pca_iris$x[,1:ncomp]) %*% varimax(rawLoadings)$rotmat
print(scores[1:5,])                   # Scores computed via rotating the scores

This yields three identical outputs:

1 -1.083475  0.9067262
2 -1.377536 -0.2648876
3 -1.419832  0.1165198
4 -1.471607 -0.1474634
5 -1.095296  1.0949536

Note: The varimax function in R uses normalize = TRUE, eps = 1e-5 parameters by default (see documentation). One might want to change these parameters (decrease the eps tolerance and take care of Kaiser normalization) when comparing the results to other software such as SPSS. I thank @GottfriedHelms for bringing this to my attention. [Note: these parameters work when passed to the varimax function, but do not work when passed to the psych::principal function. This appears to be a bug that will be fixed.]
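For instance, to disable Kaiser normalization and tighten the tolerance, one might call (a sketch, reusing rawLoadings and pca_iris from the code above; the exact settings needed to match SPSS depend on its options):

vm     <- varimax(rawLoadings, normalize = FALSE, eps = 1e-10)
scores <- scale(pca_iris$x[,1:ncomp]) %*% vm$rotmat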

amoeba
  • I see this now, and I think you are correct. I will edit my original answer (or add another one) to trace the source of the discrepancy. I liked yours and @ttnphns's very complete and enlightening answers, providing detailed explanations not usually found in books. – F. Tusell Feb 12 '15 at 15:25
  • @amoeba I am trying to do a PCA + varimax using `principal`, `prcomp` and `princomp`, but the resulting loadings / study conclusions are very different from each other. From what I understand, prcomp and princomp do not return standardized scores nor loadings. My question is: what is the best approach? Do I really want standardized results? Isn't my code `pca_iris`… – JMarcelino Oct 01 '15 at 14:38
  • @JMarcelino, no, your code does varimax-rotation on the eigenvectors, not on the loadings. This is not how varimax rotation is usually understood or applied. – amoeba Oct 01 '15 at 15:01
  • @amoeba Thank you a lot for the explanation. I was losing my mind because I thought prcomp$rotation were the loadings... That's why we need to multiply them by the standard deviations. Now, can you explain to me the steps after the varimax rotation? All the inverting and transposing of matrices? Thank you in advance. – JMarcelino Oct 03 '15 at 11:26
  • @JMarcelino, are you asking why the math works as I say it does in the method #2? It's simple if you are familiar with this sort of linear algebra. PCA is SVD decomposition $X=USV^\top$. Applying a rotation such as varimax means inserting $RR^\top$ for a rotation matrix $R$ as follows: $X=URR^\top SV^\top$. Rotated loadings are $L=VSR/\sqrt{n-1}$, rotated standardized scores are $T=UR\sqrt{n-1}$, so $$X=TL^\top.$$ You know $X$ and $L$; how to get $T$? Well, the answer is $$T=X(L^\top)^+ = X(L^+)^\top.$$ – amoeba Oct 03 '15 at 19:59
  • I got an answer from the maintainer of the package, Prof. Revelle. It seems to be a bug in the handling of the parameters in the `principal` procedure, which always computes with Kaiser normalization and eps=1e-5. There is no information so far on why the version on r-fiddle.org works correctly. So we should await updates - and I should delete all the now-obsolete comments. amoeba - it would be good to update the remark in your answer accordingly. Thanks for all the cooperation! – Gottfried Helms May 19 '16 at 06:53
  • The results are not exactly the same. The second and third results differ from the first one. – Nip Sep 14 '20 at 03:29
  • Why are you taking the first 5 rows? print(pca_iris_rotated$scores[1:5,]). – Emmanuel Goldstein May 17 '21 at 07:15
  • Does "psych::principal" include scaling? – Emmanuel Goldstein May 17 '21 at 07:35
  • One more thing: rawLoadings are wrongly computed. My understanding is that loadings are computed as the product of the eigenvector and the square root of the eigenvalue, but I see no sqrt there. – Emmanuel Goldstein May 17 '21 at 09:21
10

You need to use the matrix $loadings, not $rotmat:

 x <- matrix(rnorm(600), 60, 10)
 prc <- prcomp(x, center=TRUE, scale.=TRUE)
 varimax7 <- varimax(prc$rotation[,1:7])    # rotate the first 7 eigenvectors
 newData <- scale(x) %*% varimax7$loadings  # project the scaled data onto them

The matrix $rotmat is the orthogonal matrix that produces the new loadings from the unrotated ones.
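One can check this relationship directly (a sketch continuing the snippet above; the difference should be zero up to floating point):

max(abs(varimax7$loadings - prc$rotation[,1:7] %*% varimax7$rotmat))  # ~ 0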

EDIT as of Feb 12, 2015:

As rightly pointed out below by @amoeba (see also his/her previous post, as well as another post from @ttnphns), this answer is not correct. Consider an $n\times m$ data matrix $X$. The singular value decomposition is $$X = USV^T$$ where $V$ has as its columns the (normalized) eigenvectors of $X^TX$. Now, a rotation is a change of coordinates and amounts to writing the above equality as $$X = (UST)(T^TV^T) = U^*V^*$$ with $T$ an orthogonal matrix chosen to achieve a $V^*$ close to sparse (maximum contrast between entries, loosely speaking).

Now, if that were all, which it is not, one could post-multiply the equality above by $(V^*)^T$ to obtain the scores $U^*$ as $X(V^*)^T$. But of course we never rotate all PCs. Rather, we consider a subset of $k<m$ which still provides a decent rank-$k$ approximation of $X$, $$X \approx (U_kS_k)(V_k^T)$$ so the rotated solution is now $$X \approx (U_kS_kT_k)(T_k^TV_k^T) = U_k^*V_k^*$$ where now $V_k^*$ is a $k\times m$ matrix. We cannot simply multiply $X$ by the transpose of $V_k^*$ any more, but rather we need to resort to one of the solutions described by @amoeba.

In other words, the solution I proposed is only correct in the particular case where it would be useless and nonsensical.

Heartfelt thanks go to @amoeba for making clear this matter to me; I have been living with this misconception for years.

One point where the note above departs from @amoeba's post is that he/she associates $S$ with $V$ in the loadings $L$. I think in PCA it is more common to have $V$'s columns of norm 1 and to absorb $S$ into the principal component scores. In fact, those are usually presented as linear combinations $Xv_i$ $(i=1,\ldots,m)$ of the original (centered, perhaps scaled) variables, subject to $\|v_i\|=1$. Either way is acceptable, I think, as is everything in between (as in biplot analysis).

FURTHER EDIT Feb. 12, 2015

As pointed out by @amoeba, even though $V_k^*$ is rectangular, the solution I proposed might still be acceptable: $V_k^*(V_k^*)^T$ would give a unit matrix and $X(V_k^*)^T \approx U_k^*$. So it all seems to hinge on the definition of scores that one prefers.
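A quick numerical confirmation of this point (a sketch on random data, rotating the first $k$ eigenvectors as in the code above):

set.seed(1)
X <- scale(matrix(rnorm(600), 60, 10))
s <- svd(X)
k <- 3
Tk  <- varimax(s$v[, 1:k])$rotmat       # T_k, the k x k rotation matrix
Vks <- t(Tk) %*% t(s$v[, 1:k])          # V_k*, a k x m matrix
max(abs(Vks %*% t(Vks) - diag(k)))      # ~ 0: V_k* (V_k*)^T is a unit matrix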

F. Tusell
  • Ah right, grand. I got confused because the loadings for prcomp are called "rotation"; I should have read the help better. Since I am using "center=TRUE, scale=TRUE" in the prcomp call, does that mean that I really ought to be centering and scaling my data before multiplying it by my varimax$loadings? – Scott May 17 '13 at 14:54
  • Yes, good point, my mistake. Centering would not matter, as it would only shift the points, but the scale should be the same as that used to compute the principal components, which are not invariant to scaling. – F. Tusell May 17 '13 at 16:30
  • I forgot to mention that you might want to look at function factanal, if you have not done it already. It does factor analysis rather than principal components, but will return the scores directly. – F. Tusell May 17 '13 at 16:33
  • -1. I believe that this answer is not correct and I posted my own answer to demonstrate it. One cannot get rotated scores by orthogonal projection on the rotated loadings (because they are not orthogonal anymore). The simplest way to obtain the correct scores is to use `psych::principal`. [Apart from that, I edited your answer to insert the scaling, as discussed in the comments above.] – amoeba Feb 09 '15 at 23:01
  • Thanks for the edit! I appreciate it, and I removed my downvote. Your last paragraph touches on important point, namely that one can rotate eigenvectors $V$, or one can rotate *loadings* $VS$. As far as I know, in factor analysis literature rotations are (almost?) always applied to the loadings, because people usually consider loadings as the more interpretable quantity. In contrast, in PCA literature loadings are not much mentioned, but then rotations are usually not discussed either. In any case, one can rotate whatever one wants, but it's important to be very clear about it. – amoeba Feb 12 '15 at 16:47
  • Technical issue: $T_k$ is probably $k\times k$, not $k\times n$. Actually, now that I think about it, why do you say that after rotation one cannot multiply $X$ by $V^*_k$? It looks to me that it would work! If I am correct, then if one rotates the eigenvectors, your original solution actually works fine. It's only if one rotates the loadings that it does not work... By the way, note that `psych::principal` does rotate the loadings, not eigenvectors. – amoeba Feb 12 '15 at 16:52
  • Sorry, my bad. I meant $V_k^*$ is $k\times m$. I will correct it now. And... yes, now that I look at it, $V$ has orthogonal columns, so $(T_k^TV_k^T)(V_kT_k)$ would still get us a unit matrix, right? If so, I did not mislead the original poster; you lift a load from my soul! – F. Tusell Feb 12 '15 at 17:33
  • Yes, I think so. It was not entirely clear to me when I wrote my answer either. It all hinges on the definition of "rotation"... Your answer was assuming that we rotate "raw scores", whereas my answer was assuming that we rotate "standardized scores". But again, I am pretty sure that my approach is more standard in the factor analysis / "rotations" literature. I will edit my answer later today, to reflect this understanding and your updated answer. – amoeba Feb 12 '15 at 18:31
  • Updated my answer. One additional remark is that in your approach, where it is eigenvectors (and non-scaled PCs) that are rotated, the resulting scores will not be uncorrelated anymore. Your new scores $U_kS_kT_k$ will not have diagonal covariance. – amoeba Feb 12 '15 at 21:06
  • Is this code correct or not? :) – Emmanuel Goldstein May 16 '21 at 19:25
0

I was looking for a solution that works for PCA performed using ade4.

Please find the function below:

library(ade4)

irisX <- iris[,1:4]      # Iris data
ncomp <- 2
# With ade4
dudi_iris <- dudi.pca(irisX, scannf = FALSE, nf = ncomp)

rotate_dudi.pca <- function(pca, ncomp = 2) {
  # Loadings = eigenvectors (normed column scores $c1) scaled by sqrt(eigenvalues)
  rawLoadings <- as.matrix(pca$c1[,1:ncomp]) %*% diag(sqrt(pca$eig), ncomp, ncomp)
  pca$c1 <- rawLoadings
  # Standardize the row scores $li, then rotate them by the varimax rotation matrix
  pca$li <- scale(pca$li[,1:ncomp]) %*% varimax(rawLoadings)$rotmat
  return(pca)
}
rot_iris <- rotate_dudi.pca(pca = dudi_iris, ncomp = ncomp)
print(rot_iris$li[1:5,])                   # Scores computed via rotating the scores
#>        [,1]       [,2]
#> 1 -1.083475 -0.9067262
#> 2 -1.377536  0.2648876
#> 3 -1.419832 -0.1165198
#> 4 -1.471607  0.1474634
#> 5 -1.095296 -1.0949536

Created on 2020-01-14 by the reprex package (v0.3.0)
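As a sanity check (a sketch assuming pca_iris, ncomp, and rawLoadings from the accepted answer are still in the workspace), these scores agree with method 3 there up to column signs, which are arbitrary in PCA:

prcScores <- scale(pca_iris$x[, 1:ncomp]) %*% varimax(rawLoadings)$rotmat
max(abs(abs(rot_iris$li) - abs(prcScores)))   # ~ 0, apart from sign flips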

Hope this helps!

  • You need to use this space for an answer. – Michael R. Chernick Jan 14 '20 at 17:42
  • It seemed to me that it is valid to add an answer for completeness. Like for this question: https://stackoverflow.com/questions/6862742/draw-a-circle-with-ggplot2. I will be happy to move my proposition if necessary. – Alain Danet Jan 14 '20 at 18:07
  • I misunderstood because it sounded like you were making a correction to an error in one of the answers. I see that it is an addition for a particular software package, ade4. Cross Validated doesn't look at questions or answers that are strictly about code. Stack Overflow is where software issues are addressed. – Michael R. Chernick Jan 14 '20 at 19:06