I want to compare the solutions produced by two instances of PCA, arising either from two different implementations or from two subsets of the data.

One possible way is to calculate the similarity between the loadings (e.g. using the inner product). But as this post discusses, the signs of the loading elements are arbitrary.

Then, what is the correct way to compare the two PCA outputs?

Is it valid to just take the absolute value of the loadings?

EDIT

There is some confusion about loadings and eigenvectors. By "loadings" I mean the eigenvectors scaled by the square roots of the corresponding eigenvalues. So which of the two (loadings or eigenvectors) is the better choice for comparing the PCA solutions, and why?
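For concreteness, here is a minimal NumPy sketch of that convention (loadings = eigenvectors scaled by the square roots of their eigenvalues), using a hypothetical random data matrix `X`:

```python
import numpy as np

# Hypothetical data: 100 observations of 5 features.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))

# Center the data and eigendecompose the covariance matrix.
Xc = X - X.mean(axis=0)
cov = np.cov(Xc, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)

# eigh returns eigenvalues in ascending order; sort descending.
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Loadings: each unit-norm eigenvector scaled by the square root
# of its eigenvalue, so each column's norm is sqrt(eigenvalue).
loadings = eigvecs * np.sqrt(eigvals)
```

Under this convention the loadings carry the variance information (column norms equal the component standard deviations), while the bare eigenvectors do not.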

PS: @ttnphns commented that using loadings is the right thing to do, but it is not clear to me why.

Krrr
  • If some rotatedness of one loading structure relative to the other is the source of their discrepancy, do a Procrustes rotation (with the reflection option not switched off) to superimpose them, and compute the cosine, aka Tucker's coefficient of congruence. If rotatedness isn't allowed, just compute the coefficient; yes, you may reflect, i.e. change the sign of the loadings in any component. – ttnphns Aug 21 '17 at 17:31
  • @ttnphns thanks for your comment. So must the sign of all elements (equal to the number of features of the data) of one component's loadings be changed, or can it be done for each element separately? – Krrr Aug 21 '17 at 17:39
  • You are free to change the sign of all values (loadings or scores) in any component, but not of individual values. – ttnphns Aug 21 '17 at 18:37
  • @ttnphns another related question. To compare the two instances of PCA solutions, is it more appropriate to use loadings (scaled eigenvectors) or eigenvectors? – Krrr Aug 23 '17 at 16:21
  • Loadings, not eigenvectors. – ttnphns Aug 23 '17 at 20:12
  • `ttnphns commented that using loadings is the right thing to do but it is not clear to me why` Because [loadings](https://stats.stackexchange.com/q/143905/3277) contain information about the eigenvalues as well; eigenvectors don't. Eigenvectors say nothing about the strength of association between a component and an item. – ttnphns Aug 24 '17 at 08:16
  • Because the loadings can be interpreted as major axes of an ellipsoid, you can equate them with a (unique) positive-semidefinite matrix. This permits you to apply any solutions that might be posted to the question at https://stats.stackexchange.com/questions/299521. – whuber Aug 24 '17 at 14:31
  • I was curious if you came up with a solution for this comparison? I am undertaking something similar. – tnt Jun 04 '20 at 02:47
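Following ttnphns's suggestion above, here is a minimal sketch of Tucker's coefficient of congruence with reflection allowed (the absolute value is taken per component, since a sign flip of a whole component is permissible). The matrices `A` and `B` are hypothetical loading matrices with components in columns:

```python
import numpy as np

def tucker_congruence(A, B):
    """Columnwise Tucker congruence between two loading matrices.

    Reflection (sign flip of an entire component) is allowed, so the
    absolute value of each coefficient is returned.
    """
    num = np.sum(A * B, axis=0)
    den = np.sqrt(np.sum(A**2, axis=0) * np.sum(B**2, axis=0))
    return np.abs(num / den)

# Example: identical loadings up to a sign flip of the second component.
A = np.array([[0.9, 0.1],
              [0.8, -0.2],
              [0.1, 0.7]])
B = A.copy()
B[:, 1] *= -1
print(tucker_congruence(A, B))  # both components are fully congruent
```

This handles the reflection issue directly; if the two solutions may also differ by a rotation, a Procrustes rotation (e.g. `scipy.spatial.procrustes`) would be applied first, as the comments suggest.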

0 Answers