I am fairly familiar with the practical application of principal component analysis (PCA). PCA finds the first PC, for example, by minimizing the sum of squared perpendicular distances of the observations $x_1, x_2, \ldots, x_n$ from the lower-dimensional subspace.
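To check that this objective really recovers the usual PCA solution, here is a minimal NumPy sketch (the 2-D covariance matrix and the sample size are arbitrary choices) comparing a brute-force minimizer of the perpendicular distances with the top eigenvector of the sample covariance matrix:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.multivariate_normal([0.0, 0.0], [[3.0, 1.0], [1.0, 1.0]], size=500)
X -= X.mean(axis=0)  # PCA assumes centered data

# First PC as the top eigenvector of the sample covariance matrix
S = np.cov(X, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(S)  # eigenvalues in ascending order
pc1 = eigvecs[:, -1]

# Brute force over unit directions: the squared perpendicular distance of a
# point x to the line spanned by a unit vector d is ||x||^2 - (x . d)^2
angles = np.linspace(0.0, np.pi, 10_000)
dirs = np.column_stack([np.cos(angles), np.sin(angles)])
loss = (X ** 2).sum() - ((X @ dirs.T) ** 2).sum(axis=0)
best = dirs[np.argmin(loss)]

print(pc1, best)  # equal up to sign: minimizing perpendicular distances
                  # and maximizing projected variance pick the same axis
```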
However, as far as I understand, PCA adds no 'value' as a data-reduction technique if the variables (the columns of the original design matrix $X$) hardly correlate. In the case of zero correlations, for instance, the orthonormal matrix $B$ ends up being the identity matrix (up to signs and a reordering of its columns by variance), and the transformation $Z = XB$ essentially yields the initial design matrix $X$.
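To illustrate, here is a small simulation of the zero-correlation case (the variances 9, 4, 1 are arbitrary choices); the estimated $B$ comes out as approximately a signed permutation of the identity:

```python
import numpy as np

rng = np.random.default_rng(1)

# Independent (hence uncorrelated) variables with distinct variances 9, 4, 1,
# so the population covariance matrix is diagonal
X = rng.normal(size=(100_000, 3)) * np.array([3.0, 2.0, 1.0])
X -= X.mean(axis=0)

S = np.cov(X, rowvar=False)
eigvals, B = np.linalg.eigh(S)
B = B[:, ::-1]  # reorder columns so the eigenvalues are decreasing

print(np.round(B, 2))
# ~ a signed permutation of the identity: each PC coincides with one
# original axis, and Z = X B merely reorders/flips the columns of X
```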
How could I show mathematically that PCA still performs poorly when the correlations between the original variables are low but nonzero? My hunch is that in those cases the rotation of the original coordinate axes is rather insubstantial; in other words, the angle between the original coordinate axes and the transformed coordinate axes is small. But how can I prove this? The simulation sketched below is at least consistent with the hunch.
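In the $2 \times 2$ case I believe the principal axes of a covariance matrix $\begin{pmatrix} s_{11} & s_{12} \\ s_{12} & s_{22} \end{pmatrix}$ are rotated against the coordinate axes by an angle $\theta$ satisfying $\tan(2\theta) = 2 s_{12} / (s_{11} - s_{22})$, so $\theta \to 0$ as $s_{12} \to 0$ whenever $s_{11} \neq s_{22}$; is that the right starting point, and does it generalize? Here is a quick NumPy sketch (the variances 4 and 1 and the grid of correlations are arbitrary choices) that measures the angle between the first PC and the first coordinate axis:

```python
import numpy as np

rng = np.random.default_rng(2)

def pc1_angle(rho, n=50_000):
    """Angle (degrees) between the first PC and the first coordinate axis
    for 2-D data with variances 4 and 1 and correlation rho."""
    cov = [[4.0, 2.0 * rho], [2.0 * rho, 1.0]]  # off-diagonal = rho * 2 * 1
    X = rng.multivariate_normal([0.0, 0.0], cov, size=n)
    S = np.cov(X, rowvar=False)
    pc1 = np.linalg.eigh(S)[1][:, -1]  # eigenvector of the largest eigenvalue
    return np.degrees(np.arccos(min(abs(pc1[0]), 1.0)))

for rho in [0.7, 0.3, 0.1, 0.05, 0.0]:
    print(f"rho = {rho:4.2f} -> rotation angle ~ {pc1_angle(rho):5.2f} deg")
# the rotation angle shrinks toward zero together with the correlation
```

Thanks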