
These posts explain the PCA objective in matrix form in a very nice way:

But I am still confused about the scalar notation in An Introduction to Statistical Learning, page 376:

$$\max_{\phi_{11},\dots,\phi_{p1}} \left\{\frac{1}{n}\sum_{i=1}^n\Big(\sum_{j=1}^p \phi_{j1}x_{ij}\Big)^2\right\} \quad \text{subject to} \quad \sum_{j=1}^p \phi_{j1}^2=1.$$

I think $\frac 1 n \sum_{i=1}^n\sum_{j=1}^p x_{ij}^2$ makes up the covariance matrix, but how are the $\phi_{j1}$ related to the "weights"?

Could anyone help me to build the connection between these two notations?

Haitao Du

1 Answer


Let us recapitulate what is written in the posts that you linked to.

Let $\mathbf X$ be your $n\times p$ data matrix and let it be centered, i.e. the sum of each column equals zero. The $p\times p$ covariance matrix is then equal to $\mathbf C = \frac{1}{n}\mathbf X^\top \mathbf X$.

Let $\mathbf w$ be a unit $p$-dimensional vector, i.e. $\mathbf w \in \mathbb R^p$ such that $\mathbf w^\top \mathbf w = 1$. Then the projection of the data onto the axis spanned by this vector is given by $\mathbf z = \mathbf {Xw}$. Its mean is zero, and hence its variance is simply the average squared value, i.e. $$\operatorname{Var}(\mathbf z) = \frac{1}{n}\mathbf w^\top \mathbf X^\top\mathbf{Xw}=\mathbf w^\top \mathbf {Cw}.$$ This is the PCA objective function.
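To make this identity concrete, here is a minimal NumPy sketch (the random data and variable names are mine, not from the book or the linked posts):

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 100, 5

X = rng.normal(size=(n, p))
X = X - X.mean(axis=0)          # center each column so column sums are zero

C = X.T @ X / n                  # p x p covariance matrix

w = rng.normal(size=p)
w = w / np.linalg.norm(w)        # unit vector, so w^T w = 1

z = X @ w                        # projection onto the axis spanned by w
var_direct = np.mean(z ** 2)     # variance of z (its mean is zero by construction)
var_quadratic = w @ C @ w        # w^T C w

print(np.isclose(var_direct, var_quadratic))  # True
```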

Now we can spell out the matrix multiplications. The projection is given by $$z_i = \sum_j X_{ij}w_j,$$ and its average squared value is $$\operatorname{Var}(\mathbf z)=\frac{1}{n}\sum_i z_i^2 = \frac{1}{n}\sum_i\Big[\sum_j X_{ij}w_j\Big]^2,$$ which is exactly the formula given in your book, with $w_j$ playing the role of the weights $\phi_{j1}$.
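Purely as an illustration, the scalar double sum and the matrix expression can be compared numerically (a self-contained sketch with the same kind of setup as above):

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 100, 5
X = rng.normal(size=(n, p))
X = X - X.mean(axis=0)           # centered data
w = rng.normal(size=p)
w = w / np.linalg.norm(w)        # unit weight vector

# Scalar form from the book: (1/n) * sum_i ( sum_j X_ij * w_j )^2
var_scalar = sum(sum(X[i, j] * w[j] for j in range(p)) ** 2 for i in range(n)) / n

# Matrix form: w^T C w with C = X^T X / n
var_matrix = w @ (X.T @ X / n) @ w

print(np.isclose(var_scalar, var_matrix))  # True
```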


Note: $\frac 1 n \sum_{i=1}^n\sum_{j=1}^p X_{ij}^2$ is not a covariance matrix, contrary to what you wrote; this expression is a single number (the squared Frobenius norm of $\mathbf X$ divided by $n$), not a matrix element. For an element of the covariance matrix you should write $$C_{ij}=\frac 1 n \sum_{k=1}^n X_{ki}X_{kj}.$$
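A quick numerical check of this element-wise formula against $\mathbf C = \frac{1}{n}\mathbf X^\top\mathbf X$ (the indices $i, j$ below are picked arbitrarily):

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 100, 5
X = rng.normal(size=(n, p))
X = X - X.mean(axis=0)           # centered data

C = X.T @ X / n                  # covariance matrix in matrix form

# Element-wise: C_ij = (1/n) * sum_k X_ki * X_kj
i, j = 1, 3
C_ij = sum(X[k, i] * X[k, j] for k in range(n)) / n

print(np.isclose(C_ij, C[i, j]))  # True
```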

amoeba