
I did a lot of reading on this blog and elsewhere about PCA, SVD, loadings, etc., but I still don't understand why loadings, which represent the correlations between the principal components and the original variables, are mathematically defined by

loadings = eigenvector * square root (eigenvalue)

It seems I just can't grasp it. Could somebody please explain the mathematics behind it to me?
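[Editor's note: a quick numerical check makes the claim concrete. The sketch below is illustrative, not from the thread; it standardizes the data, as amoeba's first comment requires, and verifies that eigenvector * sqrt(eigenvalue) reproduces the correlations between the original variables and the PC scores.]

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3)) @ rng.normal(size=(3, 3))   # correlated data
X = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)          # standardize columns

R = np.cov(X, rowvar=False)                 # correlation matrix (vars standardized)
eigvals, eigvecs = np.linalg.eigh(R)
loadings = eigvecs * np.sqrt(eigvals)       # eigenvector * sqrt(eigenvalue), columnwise

scores = X @ eigvecs                        # principal component scores
corr = np.array([[np.corrcoef(X[:, i], scores[:, j])[0, 1]
                  for j in range(3)]
                 for i in range(3)])        # corr(variable i, PC j)

print(np.allclose(loadings, corr))          # the two matrices agree
```

Up to floating-point tolerance the two matrices match; the comments below derive the same identity via the SVD.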

asked by Concetta, edited by amoeba
    This is only true if all the original variables were standardized prior to PCA. You can find a mathematical explanation e.g. in the beginning of my answer here https://stats.stackexchange.com/questions/104306 – amoeba Dec 20 '18 at 13:13
    Since a correlation is a *number* (between -1 and 1) and your definition of "loading" is a vector whose components could have arbitrarily large values, it isn't plausible to describe your loading as "representing correlation." – whuber Dec 20 '18 at 14:28
  • @whuber The word "correlation" should be in plural. I edited. Other than that, the question makes total sense. – amoeba Dec 20 '18 at 14:59
  • @amoeba Thanks a lot for your answer. The link you posted helped me to understand loadings, but I am still struggling with the mathematics. Your linked answer gives an equation to compute the cross-covariance matrix between the original variables and the standardized PCs, and I think this is exactly the answer to my question. It starts with (1/(N-1)) * X.transposed * (sqrt(N-1) * U). Where does this formula come from? I also don't understand the first transformation of the equation. It would be really great if you could explain this equation to me, especially the first step. – Concetta Dec 20 '18 at 15:09
  • Cross-covariance matrix between matrices A and B (assuming both have centered columns) is `A.transposed * B / n`. Can you be more specific as to what you don't understand? Do you know what covariance is? Can you follow these matrix operations? I don't know what level of explanation you need. – amoeba Dec 20 '18 at 15:15
  • @amoeba Thanks again for your reply! I know covariance, correlation, and how PCA works, but I am still struggling with the equation I mentioned above. If I understood it correctly, in PCA (doing an SVD) the cross-covariance between the original data and the PCs should be (1/(N-1)) * X.transposed * V. But your equation starts with (1/(N-1)) * X.transposed * (sqrt(N-1) * U). Where does this (sqrt(N-1) * U) come from? And how does the next transformation in this equation work? Best regards and happy Xmas – Concetta Dec 22 '18 at 09:28
  • Yes, cross-covariance between the original features and the PCs is what you wrote but the cross-covariance between the original features and *standardized* PCs (i.e. PCs scaled to variance 1) is what I wrote. If your features are also standardized, then this is exactly the cross-correlation matrix. Because correlation is covariance between standardized variables. – amoeba Dec 22 '18 at 22:54
  • @amoeba I finally understand your equation, the accompanying transformations, and therefore why loadings = eigenvector * square root (eigenvalue). So, just to be sure I got it right: if my original data/variables are standardized (variance = 1), then loadings = eigenvectors, and using GNU R, loadings = correlation loadings. Right? Thank you so very much!!! – Concetta Dec 23 '18 at 13:35
  • `if my original data/variables are standardized (varinace=1) loadings=eigenvectors` -- no, if your original variables are standardized, the eigenvalues don't need to be all equal to 1, so loadings will not in general be equal to the eigenvectors. – amoeba Dec 23 '18 at 23:36
  • @amoeba Your comments were really enlightening! Happy Christmas Eve! – Concetta Dec 24 '18 at 13:26
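
[Editor's note: the derivation discussed in the comments can be sketched numerically. This is an illustration, not from the thread; it assumes centered columns, the thin SVD X = U S Vᵀ, and covariance eigenvalues λᵢ = sᵢ² / (N−1), so that (1/(N−1)) Xᵀ (√(N−1) U) = V S / √(N−1) = V · diag(√λ), i.e. eigenvector * sqrt(eigenvalue).]

```python
import numpy as np

rng = np.random.default_rng(1)
N = 300
X = rng.normal(size=(N, 4)) @ rng.normal(size=(4, 4))  # correlated data
X -= X.mean(axis=0)                                    # center columns

U, s, Vt = np.linalg.svd(X, full_matrices=False)       # thin SVD: X = U S V^T
lam = s**2 / (N - 1)                   # eigenvalues of the covariance matrix

std_pcs = np.sqrt(N - 1) * U           # PCs rescaled to unit variance
cross_cov = X.T @ std_pcs / (N - 1)    # (1/(N-1)) X^T (sqrt(N-1) U)

# X^T U = V S, so cross_cov = V S / sqrt(N-1) = V * sqrt(lam) columnwise:
print(np.allclose(cross_cov, Vt.T * np.sqrt(lam)))
```

If the columns of X are additionally standardized, this cross-covariance matrix is exactly the cross-correlation matrix, i.e. the loadings in the question.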

0 Answers