After reading this, I am still not sure how to intuitively understand why the eigenvectors of a covariance matrix represent the directions of LARGEST variance.
I need more explanation of this passage:
"Then $w^{T}Cw$ simplifies to $\sum\lambda_i w_i^{2}$, in other words the variance is given by the weighted sum of the eigenvalues. It is almost immediate (why?) that to maximize this expression one should simply take $w=(1,0,0,…,0)$, i.e. the first eigenvector, yielding variance $\lambda_1$."
Here, is $C$ the original matrix before the decomposition, or have both $C$ and $w$ been rewritten in the eigenvector basis? If they changed basis, how do we know the variance in the original basis, $w^{T}Cw$, is the same as the variance $w^{T}Cw$ computed in the new basis?
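As a sanity check on my own understanding (this is my sketch, not part of the quoted answer), here is a small NumPy experiment. Since $C = V\Lambda V^{T}$ with $V$ orthogonal, expressing $w$ in the eigenbasis as $\tilde{w} = V^{T}w$ should leave the quadratic form unchanged: $w^{T}Cw = \sum_i \lambda_i \tilde{w}_i^{2}$, which is bounded above by $\lambda_1$ for unit $w$.

```python
import numpy as np

# Numeric check: the variance w^T C w is invariant under the orthogonal
# change of basis given by the eigenvectors of C.
rng = np.random.default_rng(0)
A = rng.standard_normal((5, 5))
C = A @ A.T                        # symmetric PSD, stands in for a covariance

eigvals, V = np.linalg.eigh(C)     # columns of V: orthonormal eigenvectors
order = np.argsort(eigvals)[::-1]  # sort descending so λ₁ comes first
eigvals, V = eigvals[order], V[:, order]

w = rng.standard_normal(5)
w /= np.linalg.norm(w)             # unit vector, as in the quoted argument

w_tilde = V.T @ w                  # same vector, coordinates in the eigenbasis
var_original = w @ C @ w
var_eigenbasis = np.sum(eigvals * w_tilde**2)   # Σ λᵢ w̃ᵢ²

print(np.allclose(var_original, var_eigenbasis))  # True: same variance
print(var_original <= eigvals[0] + 1e-12)         # True: bounded by λ₁
# choosing w = first eigenvector attains the maximum λ₁
print(np.isclose(V[:, 0] @ C @ V[:, 0], eigvals[0]))  # True
```

This at least confirms numerically that the two expressions for the variance agree, and that the first eigenvector attains the bound $\lambda_1$, even though it doesn't answer the intuition question.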