
Let $x_1, \dots, x_n$ be random variables with covariance matrix $\Sigma$.

  1. $ a^Tx:=\sum_{i=1}^na_ix_i $

  2. $ \sum a_i^2=1 $

How do I show that the linear combination in (1), subject to the constraint in (2), has maximal variance when $a$ is an eigenvector of $\Sigma$ with maximal eigenvalue?

I have been reading all day trying to understand the steps, and I'm sure there is something I'm missing.
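Here is a minimal numerical sketch of the setup, with a made-up covariance matrix and coefficient vector (not anything specific to my data); it just checks that the variance of the combination in (1), under the constraint in (2), equals $a^T \Sigma a$:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical example: n = 3 correlated variables with a chosen covariance Sigma.
Sigma = np.array([[2.0, 0.8, 0.3],
                  [0.8, 1.0, 0.2],
                  [0.3, 0.2, 0.5]])
x = rng.multivariate_normal(mean=np.zeros(3), cov=Sigma, size=100_000)

# A unit-norm coefficient vector a (constraint 2): sum of squares is 1.
a = np.array([0.6, 0.7, np.sqrt(1 - 0.6**2 - 0.7**2)])

# Variance of the linear combination a^T x (definition 1), two ways:
empirical = np.var(x @ a)       # sample variance of the combination
theoretical = a @ Sigma @ a     # a^T Sigma a
print(empirical, theoretical)   # should agree up to sampling noise
```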

  • Does this duplicate https://stats.stackexchange.com/questions/2691/making-sense-of-principal-component-analysis-eigenvectors-eigenvalues help you? – kjetil b halvorsen Sep 02 '21 at 17:04
  • @kjetilbhalvorsen No, I already read it and couldn't get the information I need to understand how to solve my problem. Thank you – Noam Sep 05 '21 at 08:02

1 Answer


There is a good, long explanation of PCA at Making sense of principal component analysis, eigenvectors & eigenvalues, so here I will only show why the eigenvector with the largest eigenvalue maximizes the variance of the linear combination.

Let $\Sigma$ be the variance-covariance matrix of the random variables $X_1, X_2, \dotsc, X_n$, which we take to be the components of the random vector $X$. That is, $$ \DeclareMathOperator{\V}{\mathbb{V}} \DeclareMathOperator{\C}{\mathbb{C}} \C(X_i, X_j) = \Sigma_{ij}. $$ Since $\Sigma$ is a symmetric matrix, it has a spectral decomposition $$ \Sigma = U \Lambda U^T = \sum_{i=1}^n \lambda_i u_i u_i^T, $$ where $U$ is orthogonal with columns $u_1, \dotsc, u_n$ and $\Lambda$ is diagonal with entries $\lambda_1 \ge \lambda_2 \ge \dotsb \ge \lambda_n \ge 0$ (the eigenvalues are nonnegative because $\Sigma$ is positive semi-definite).

Now let $a$ be an $n$-vector with $\sum_i a_i^2 = \|a\|^2 = 1$, defining the linear combination $a^T X$. We want to choose $a$ so as to maximize the variance of the linear combination: $$ \V\left( a^T X \right) = a^T \V(X)\, a = a^T \Sigma a = a^T \left( \sum_{i=1}^n \lambda_i u_i u_i^T \right) a = \sum_{i=1}^n \lambda_i (a^T u_i)^2. $$

Because the vectors $u_i$ are mutually orthonormal, they form a basis, so the weights $w_i = (a^T u_i)^2$ satisfy $\sum_{i=1}^n w_i = \|a\|^2 = 1$. The sum above is therefore a weighted average (a convex combination) of the eigenvalues $\lambda_i$, and it is maximized by putting all the weight on the largest one, $\lambda_1$. That happens when we choose $a = u_1$, the eigenvector corresponding to the largest eigenvalue: then $(a^T u_1)^2 = 1$ and, by orthonormality, the weights on all the other eigenvectors are zero, so $\V(a^T X) = \lambda_1$. That finishes the proof.
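As a quick numerical sanity check of this argument (a sketch with a simulated, made-up $\Sigma$, nothing more): over unit vectors $a$, the quadratic form $a^T \Sigma a$ never exceeds the largest eigenvalue of $\Sigma$, and it attains that value at the corresponding eigenvector.

```python
import numpy as np

rng = np.random.default_rng(1)

# A made-up covariance matrix: A A^T is symmetric positive semi-definite.
A = rng.standard_normal((5, 5))
Sigma = A @ A.T

# Spectral decomposition: eigh returns eigenvalues in ascending order,
# with orthonormal eigenvectors as the columns of U.
eigvals, U = np.linalg.eigh(Sigma)
lambda_max, u_max = eigvals[-1], U[:, -1]

# The top eigenvector achieves variance equal to the largest eigenvalue ...
print(u_max @ Sigma @ u_max, lambda_max)

# ... and no other unit vector does better.
a = rng.standard_normal((10_000, 5))
a /= np.linalg.norm(a, axis=1, keepdims=True)          # enforce sum a_i^2 = 1
variances = np.einsum('ij,jk,ik->i', a, Sigma, a)      # a^T Sigma a for each a
print(variances.max() <= lambda_max + 1e-12)           # True
```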

kjetil b halvorsen