
In PCA we start with a dataset and reduce its dimensionality by constructing new features, each a linear combination of the original features, and keeping only the ones with the largest variance.

The directions of these new features are eigenvectors of the covariance matrix of the original dataset. For some reason, projecting the data onto the eigenvectors of the covariance matrix does two things:

  1. These new features will have a much larger variance than any of the features in the original dataset had. I don’t see why…

  2. These new features will all have zero covariance with one another… I also don’t see why (a small numerical sketch of both claims follows this list).

Update - I understand why they have zero covariance with one another: it's because the covariance matrix is symmetric, which means its eigenvectors are perpendicular.

Covariance can be thought of as a (scaled) dot product between two centered feature vectors, and if the vectors are perpendicular, the dot product is zero.

But...I'm still confused as to why we use THOSE directions.

  • Why use eigenvectors of the covariance matrix: You don't have to - there are other ways to arrive at the same solution: https://stats.stackexchange.com/q/79043/4598. Also: The first PCs have larger (or equal) variance to the original features. The last PCs have smaller variance; total variance is preserved. – cbeleites unhappy with SX May 04 '19 at 21:55
  • (1) is incorrect. At least one principal component will have variance at least as great as the largest variance among the original variables, but that's all you can say generally. – whuber May 04 '19 at 21:55
  • whuber had a good post with excellent graphs. https://stats.stackexchange.com/questions/62092/bottom-to-top-explanation-of-the-mahalanobis-distance – EngrStudent May 05 '19 at 12:10
  • We expect you to be human and polite, so confused -- that's why you're asking -- and appreciative of whatever help may come. But it's more than fine to ask your question directly without personal testimony. That's why I edited it down. – Nick Cox May 05 '19 at 12:14
  • @NickCox you're totally correct, I was just excited in the moment. Also, idk how to link a question to this one, but the one that actually helped me the most was https://math.stackexchange.com/questions/23596/why-is-the-eigenvector-of-a-covariance-matrix-equal-to-a-principal-component# ...just in case someone else needs it. – joshuaronis May 05 '19 at 22:35
