PCA plane and its distance to the data

Question

It is clear that the first principal component is the direction closest to the data, but can someone prove that the first two principal components span the plane closest to the data?

I don't think "closest to the data" is correct. – SmallChess, Mar 10 '17 at 14:17

score 0 · Answer 1 · answered Oct 06 '20 at 03:35

The first PC (PC1) is the linear combination of the centered variables, with unit-norm weights, that maximizes variance. If you replace each data point with its projection onto PC1, the result is closest to the data in the sense that it minimizes the (Euclidean) norm of the residual. Now, PC2 maximizes variance among all such linear combinations orthogonal to PC1.
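The link between maximal variance and minimal residual can be written out in one line. This is a sketch, assuming the data points $x_i$ are centered and $w$ is a unit vector (both symbols are introduced here for illustration and do not appear in the original answer):

```latex
\sum_{i=1}^{n} \lVert x_i \rVert^2
  = \sum_{i=1}^{n} (w^\top x_i)^2
  + \sum_{i=1}^{n} \lVert x_i - (w^\top x_i)\, w \rVert^2 ,
\qquad \lVert w \rVert = 1 .
```

The left-hand side does not depend on $w$, so the $w$ that maximizes the first sum on the right (the projected variance, up to a factor of $1/n$) also minimizes the second sum (the squared residual).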

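If you replace each data point with its projection onto the plane spanned by (PC1, PC2), this is again the plane closest to the point swarm in the sense of minimizing the Euclidean norm of the residual vector. Because PC1 and PC2 are orthogonal, the variance captured by the plane is the sum of their variances, and no other pair of orthonormal directions captures more (this is the Eckart–Young theorem for the rank-2 approximation). See "Geometric understanding of PCA in the subject (dual) space".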
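The claim about the plane can also be checked numerically. The following is a minimal sketch, not a proof: it assumes synthetic Gaussian data, and the names `residual_ss`, `W_pca`, and `random_residuals` are made up for this illustration. It compares the squared residual of the (PC1, PC2) plane against that of many random planes through the centroid.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data (an assumption for this demo): 200 points in 5 dimensions,
# with a different spread along each coordinate axis.
X = rng.normal(size=(200, 5)) * np.array([3.0, 2.0, 1.0, 0.5, 0.2])
Xc = X - X.mean(axis=0)  # PCA operates on centered data


def residual_ss(Xc, W):
    """Sum of squared distances from the rows of Xc to span(W).

    W is a (p x k) matrix with orthonormal columns.
    """
    proj = Xc @ W @ W.T
    return float(np.sum((Xc - proj) ** 2))


# Rows of Vt are the principal directions, ordered by decreasing variance.
_, s, Vt = np.linalg.svd(Xc, full_matrices=False)
W_pca = Vt[:2].T
pca_residual = residual_ss(Xc, W_pca)

# Residuals of random 2-D planes through the centroid.
random_residuals = []
for _ in range(1000):
    # QR of a random Gaussian matrix gives an orthonormal basis of a random plane.
    Q, R = np.linalg.qr(rng.normal(size=(5, 2)))
    random_residuals.append(residual_ss(Xc, Q))

print("PCA plane residual:      ", pca_residual)
print("best random plane found: ", min(random_residuals))
```

No random plane should beat the PCA plane, and the PCA residual should equal the sum of the squared singular values that were left out (`s[2:] ** 2`), which is the part of the total sum of squares not captured by PC1 and PC2.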
