
Suppose $\mathbf A$ is a matrix of mean-centred data. The covariance matrix $\mathbf S=\text{cov}(\mathbf A)$ is $m\times m$, has $m$ distinct eigenvalues, and has orthogonal eigenvectors $\mathbf s_1, \mathbf s_2, \dots, \mathbf s_m$.

The $i$-th principal component (some people call them "scores") is the vector $\mathbf z_i = \mathbf A\mathbf s_i$. In other words, it's a linear combination of the columns of $\mathbf A$, where the coefficients are the components of the $i$-th eigenvector of $\mathbf S$.

I don't understand why $\mathbf z_i$ and $\mathbf z_j$ turn out to be uncorrelated for all $i\neq j$. Does it follow from the fact that $\mathbf s_i$ and $\mathbf s_j$ are orthogonal? Surely not, because I can easily find a matrix $\mathbf B$ and a pair of orthogonal vectors $\mathbf x, \mathbf y$ such that $\mathbf B\mathbf x$ and $\mathbf B\mathbf y$ are correlated.
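For instance (a quick numerical illustration in Python/NumPy; the matrix $\mathbf B$ and vectors here are my own made-up example, not anything from above):

```python
import numpy as np

# Columns chosen so that B x and B y end up correlated even though x ⟂ y.
B = np.array([[1.0, 1.0, 0.0],
              [2.0, 1.0, 0.0],
              [3.0, 2.0, 0.0]])
x = np.array([1.0, 0.0, 0.0])
y = np.array([0.0, 1.0, 0.0])
assert x @ y == 0                      # x and y are orthogonal

u, v = B @ x, B @ y
uc, vc = u - u.mean(), v - v.mean()    # mean-centre each vector
print(uc @ vc)                         # nonzero, so B x and B y are correlated
```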

Ernest A

1 Answer


$$\mathbf z_i^\top \mathbf z_j = (\mathbf A\mathbf s_i)^\top (\mathbf A\mathbf s_j) = \mathbf s_i^\top \mathbf A^\top \mathbf A \mathbf s_j = (n-1) \mathbf s_i^\top \mathbf S \mathbf s_j = (n-1) \mathbf s_i^\top \lambda_j \mathbf s_j = (n-1) \lambda_j \mathbf s_i^\top \mathbf s_j = 0.$$
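A quick numerical check of this chain of equalities (a Python/NumPy sketch of my own, not part of the original answer):

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 200, 4
A = rng.normal(size=(n, m)) @ rng.normal(size=(m, m))  # correlated columns
A = A - A.mean(axis=0)             # mean-centre the data

S = A.T @ A / (n - 1)              # covariance matrix
eigvals, eigvecs = np.linalg.eigh(S)
Z = A @ eigvecs                    # principal components (scores), one per column

# cov(Z) is diagonal: the off-diagonal entries (cross-covariances) vanish,
# so the scores are pairwise uncorrelated.
C = Z.T @ Z / (n - 1)
print(np.allclose(C, np.diag(np.diag(C))))  # True
```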

amoeba
  • Mathematics: what a beautiful language. – Néstor May 25 '15 at 18:00
  • This means $\mathbf z_i$ and $\mathbf z_j$ are orthogonal. Uncorrelated means that this must be true: $(\mathbf z_i-\bar z_i)^\top(\mathbf z_j-\bar z_j)=0$. I suppose somehow $\bar z_i=\bar z_j=0$, and then $\mathbf z_i^\top \mathbf z_j=0$ also implies that they're uncorrelated. – Ernest A May 25 '15 at 18:11
  • Good point, @Ernest. The means are indeed zero, because the data has been mean-centred (per your assumption). Then all projections must have mean zero. – amoeba May 25 '15 at 22:17
  • @amoeba: While I know inclusion of $(n-1)$ has no effect on the outcome, do you mind explaining why you included it? – Jubbles May 25 '15 at 22:25
  • @Jubbles because $\mathbf S = \text{cov}(\mathbf A) = \frac{1}{n-1}\mathbf A^\top \mathbf A$, therefore $\mathbf A^\top \mathbf A = (n-1) \mathbf S$. – Ernest A May 25 '15 at 22:38
  • @Jubbles: yes, it is as Ernest wrote; you can also [consult wikipedia](http://en.wikipedia.org/wiki/Covariance_matrix#Estimation) if you wish. – amoeba May 25 '15 at 22:39
  • @Ernest, I could not resist providing an answer containing no text, but perhaps I should add that the underlying reason why PCs are uncorrelated is that their covariance matrix is given by $\mathbf S$ in the eigenvector basis, and in this basis $\mathbf S$ becomes diagonal -- that's the whole point of eigendecomposition. – amoeba May 26 '15 at 21:21