
I did a lot of reading on this blog and elsewhere about PCA, SVD, loadings, etc., but I still don't understand why loadings, which represent the correlations between the principal components and the original variables, are mathematically defined by

loadings = eigenvector * square root (eigenvalue)

It seems I just can't grasp it. Could somebody please explain the mathematics behind it to me?
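[Editor's note: a quick numerical check makes the claim concrete. The sketch below is illustrative, not from the thread; it standardizes the data, as amoeba's first comment requires, and verifies that eigenvector * sqrt(eigenvalue) reproduces the correlations between the original variables and the PC scores.]

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3)) @ rng.normal(size=(3, 3))   # correlated data
X = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)          # standardize columns

R = np.cov(X, rowvar=False)                 # correlation matrix (vars standardized)
eigvals, eigvecs = np.linalg.eigh(R)
loadings = eigvecs * np.sqrt(eigvals)       # eigenvector * sqrt(eigenvalue), columnwise

scores = X @ eigvecs                        # principal component scores
corr = np.array([[np.corrcoef(X[:, i], scores[:, j])[0, 1]
                  for j in range(3)]
                 for i in range(3)])        # corr(variable i, PC j)

print(np.allclose(loadings, corr))          # the two matrices agree
```

Up to floating-point tolerance the two matrices match; the comments below derive the same identity via the SVD.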

asked by Concetta, edited by amoeba
    This is only true if all the original variables were standardized prior to PCA. You can find a mathematical explanation e.g. in the beginning of my answer here https://stats.stackexchange.com/questions/104306 – amoeba Dec 20 '18 at 13:13
    Since a correlation is a *number* (between -1 and 1) and your definition of "loading" is a vector whose components could have arbitrarily large values, it isn't plausible to describe your loading as "representing correlation." – whuber Dec 20 '18 at 14:28
  • @whuber The word "correlation" should be in plural. I edited. Other than that, the question makes total sense. – amoeba Dec 20 '18 at 14:59
  • @amoeba Thanks a lot for your answer. The link you posted helped me to understand loadings, but I am still struggling with the mathematics. Your linked answer gives an equation to compute the cross-covariance matrix between the original variables and the standardized PCs, and I think this is exactly the answer to my question. It starts with (1/(N-1)) * X.transposed * (sqrt(N-1) * U). Where does this formula come from? I also don't understand the first transformation of the equation. It would be really great if you could explain this equation to me, especially the first step. – Concetta Dec 20 '18 at 15:09
  • Cross-covariance matrix between matrices A and B (assuming both have centered columns) is `A.transposed * B / n`. Can you be more specific as to what you don't understand? Do you know what covariance is? Can you follow these matrix operations? I don't know what level of explanation you need. – amoeba Dec 20 '18 at 15:15
  • @amoeba Thanks again for your reply! I know covariance, correlation, and how PCA works, but I am still struggling with the equation I mentioned above. If I understood it correctly, in PCA (doing an SVD) the cross-covariance between the original data and the PCs should be (1/(N-1)) * X.transposed * V. But your equation starts with (1/(N-1)) * X.transposed * (sqrt(N-1) * U). Where does this (sqrt(N-1) * U) come from? And how does the next transformation in this equation work? Best regards and happy Xmas – Concetta Dec 22 '18 at 09:28
  • Yes, cross-covariance between the original features and the PCs is what you wrote but the cross-covariance between the original features and *standardized* PCs (i.e. PCs scaled to variance 1) is what I wrote. If your features are also standardized, then this is exactly the cross-correlation matrix. Because correlation is covariance between standardized variables. – amoeba Dec 22 '18 at 22:54
  • @amoeba I finally understand your equation, the accompanying transformations, and therefore why loadings = eigenvector * square root (eigenvalue). So, just to be sure I got it right: if my original data/variables are standardized (variance = 1), then loadings = eigenvectors, and using GNU R, loadings = correlation loadings. Right? Thank you so very much!!! – Concetta Dec 23 '18 at 13:35
  • `if my original data/variables are standardized (varinace=1) loadings=eigenvectors` -- no, if your original variables are standardized, the eigenvalues don't need to be all equal to 1, so loadings will not in general be equal to the eigenvectors. – amoeba Dec 23 '18 at 23:36
  • @amoeba Your comments were really enlightening! Happy Christmas Eve! – Concetta Dec 24 '18 at 13:26
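
[Editor's note: the derivation discussed in the comments can be sketched numerically. This is an illustration, not from the thread; it assumes centered columns, the thin SVD X = U S Vᵀ, and covariance eigenvalues λᵢ = sᵢ² / (N−1), so that (1/(N−1)) Xᵀ (√(N−1) U) = V S / √(N−1) = V · diag(√λ), i.e. eigenvector * sqrt(eigenvalue).]

```python
import numpy as np

rng = np.random.default_rng(1)
N = 300
X = rng.normal(size=(N, 4)) @ rng.normal(size=(4, 4))  # correlated data
X -= X.mean(axis=0)                                    # center columns

U, s, Vt = np.linalg.svd(X, full_matrices=False)       # thin SVD: X = U S V^T
lam = s**2 / (N - 1)                   # eigenvalues of the covariance matrix

std_pcs = np.sqrt(N - 1) * U           # PCs rescaled to unit variance
cross_cov = X.T @ std_pcs / (N - 1)    # (1/(N-1)) X^T (sqrt(N-1) U)

# X^T U = V S, so cross_cov = V S / sqrt(N-1) = V * sqrt(lam) columnwise:
print(np.allclose(cross_cov, Vt.T * np.sqrt(lam)))
```

If the columns of X are additionally standardized, this cross-covariance matrix is exactly the cross-correlation matrix, i.e. the loadings in the question.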

0 Answers