
In PCA, we often think of the principal components (eigenvectors) as linear combinations of the original features in vector space. Is the reverse true as well, i.e., is each feature a linear combination of the principal components?
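(To make the question precise; the notation below is mine, not the asker's.) Write the centered $n \times p$ data matrix as $X$ and let $V$ be the $p \times p$ orthogonal matrix whose columns are the eigenvectors of the sample covariance matrix. The principal component scores are

$$Z = XV,$$

so each PC (a column of $Z$) is a linear combination of the features (the columns of $X$). The question is whether, conversely, each column of $X$ can be written as a linear combination of the columns of $Z$.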

*Asked by David; edited by kjetil b halvorsen.*
  • Do you mean *exactly* or *approximately?* – whuber Jun 29 '20 at 21:54
  • @whuber Hmm, I originally meant theoretically, so exactly. But your question makes me think that the answer is "no" to exactly, but that it can be approximately written as a linear combination of the PCs, so I would be interested in that answer as well. – David Jun 29 '20 at 22:00
  • 2
    The answer depends on how many PCs you retain. The whole idea behind PCA is to approximate the original features using a smaller number of PCs. By taking all the PCs, nothing is lost. Have you looked at our [higher voted threads on PCA?](https://stats.stackexchange.com/questions/tagged/pca?tab=Votes) – whuber Jun 29 '20 at 22:05
  • 1
    @whuber Yup. It's actually this https://stats.stackexchange.com/questions/183236/what-is-the-relation-between-k-means-clustering-and-pca one that inspired my question. The answer by amoeba says "The intuition is that PCA seeks to represent all $n$ data vectors as linear combinations of a small number of eigenvectors, and does it to minimize the mean-squared reconstruction error." I've never seen it phrased this way before, but it seems to be saying the features are represented as a linear combination of the PCs. – David Jun 29 '20 at 22:14
  • 1
    That's right. That's the basis of just about every plot of PCs that has ever been produced: the data are approximated as points in a space spanned by two (or sometimes three) PCs. – whuber Jun 29 '20 at 22:40
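Since no answers were posted, here is a minimal numerical sketch of the conclusion the comments reach (the variable names and toy data are mine, and it assumes the usual centered-data convention): with all PCs retained, every feature is recovered exactly as a linear combination of the PC scores, because $X = ZV^\top$ when $V$ is orthogonal; with fewer PCs, the same combination, truncated, is the least-squares approximation amoeba's quote describes.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))   # toy data: 100 samples, 5 features
Xc = X - X.mean(axis=0)         # PCA operates on centered data

# SVD-based PCA: the rows of Vt are the principal directions
# (eigenvectors of the sample covariance matrix).
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
Z = Xc @ Vt.T                   # PC scores: one column per principal component

# All 5 PCs retained: since V is orthogonal, Xc = Z @ Vt exactly,
# i.e. every feature is an exact linear combination of the PCs.
print(np.allclose(Z @ Vt, Xc))          # True

# Only k < 5 PCs retained: the truncated combination is the best
# rank-k approximation (minimum mean-squared reconstruction error).
k = 2
Xc_k = Z[:, :k] @ Vt[:k, :]
print(np.linalg.norm(Xc - Xc_k) > 0)    # True: reconstruction error is nonzero
```

Reading off a single feature: column $j$ of `Xc` equals `Z @ Vt[:, j]`, i.e. a linear combination of the PC score vectors with weights given by the $j$-th row of $V$: exact with all PCs, approximate with fewer.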
