Why should one scale variables before performing PCA?

Asked Feb 05 '20 at 14:39

Active Feb 05 '20 at 14:39

Frankly, this is an old topic, but I am hoping this is a new perspective. Though it has been a while, I am pretty familiar with PCA and the linear algebra beneath it. I was thinking about using PCA for a task today and was following the standard recipe of normalizing features before performing PCA, but I couldn't wrap my mind around one detail.

Intuitively, if the dataset is multivariate Gaussian, its density contours are ellipsoids. What PCA does is rotate the coordinates onto the ellipsoid's orthogonal axes and then drop the shorter ones. See image below.

The part that is unclear to me is that after you standardize your variables, the ellipsoid becomes a ball: all the orthogonal axes have the same length. What is the point of doing PCA then?

denizen of the north

You are mistaken: after standardization, the ellipsoid is almost never a ball. Consider the 2D case, for instance: the ellipsoid in standardized coordinates is always inscribed in a square and thus is oriented at 45 degrees to the axes. – whuber Feb 05 '20 at 15:27

@whuber Exactly what I am looking for. Problem solved. – denizen of the north Feb 05 '20 at 15:48
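For anyone who wants to check whuber's 2D point numerically, here is a minimal sketch. The data are made up for illustration (a correlation of 0.8 and standard deviations of 10 and 0.5 are my assumptions, not values from the question). Standardizing rescales each variable to unit variance but leaves the correlation in place, so the correlation matrix is [[1, r], [r, 1]] rather than the identity, and the ellipse stays an ellipse.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 2D Gaussian sample: very different scales, correlation 0.8.
rho = 0.8
sd = np.array([10.0, 0.5])
cov = np.array([[sd[0] ** 2, rho * sd[0] * sd[1]],
                [rho * sd[0] * sd[1], sd[1] ** 2]])
X = rng.multivariate_normal(mean=[0.0, 0.0], cov=cov, size=5000)

# Standardize: zero mean and unit variance per column.
Z = (X - X.mean(axis=0)) / X.std(axis=0)

# PCA on standardized data is an eigendecomposition of the correlation matrix.
R = np.corrcoef(Z, rowvar=False)
r = R[0, 1]
eigvals, eigvecs = np.linalg.eigh(R)  # eigenvalues in ascending order

# Direction of the leading principal axis, folded into [0, 180) degrees
# because an eigenvector's sign is arbitrary.
lead = eigvecs[:, -1]
angle_deg = np.degrees(np.arctan2(lead[1], lead[0])) % 180.0

print("sample correlation:", r)
print("eigenvalues:", eigvals)            # 1 - r and 1 + r, not both equal to 1
print("leading axis angle (deg):", angle_deg)
```

The eigenvalues come out as 1 - r and 1 + r, and the leading axis sits at 45 degrees to the coordinate axes (it would be 135 degrees for a negative correlation). They would only be equal, giving a ball, if the variables were uncorrelated.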
