There are many posts on this SE that discuss robust approaches to principal component analysis (PCA), but I cannot find a single good explanation of why PCA is sensitive to outliers in the first place.
- Because the $L_2$ norm contribution of outliers is very high. When minimizing the $L_2$ norm (which is what PCA does), those points pull harder on the fit than points closer to the middle do. – mathreadler Nov 26 '18 at 11:48
- [This answer tells you everything you need.](https://stats.stackexchange.com/a/140579/1352) Just picture an outlier and read attentively. – Stephan Kolassa Nov 26 '18 at 20:15
1 Answer
One of the reasons is that PCA can be thought of as a low-rank decomposition of the data that minimizes the sum of the squared $L_2$ norms of the residuals of the decomposition. That is, if $Y$ is your data ($m$ vectors of $n$ dimensions, stacked as columns) and $X$ is the PCA basis ($k$ vectors of $n$ dimensions), then the decomposition exactly minimizes $$\lVert Y-XA \rVert^2_F = \sum_{j=1}^{m} \lVert Y_j - X A_{\cdot j} \rVert^2.$$ Here $A$ is the matrix of coefficients of the PCA decomposition, $Y_j$ and $A_{\cdot j}$ denote the $j$-th columns of $Y$ and $A$, and $\lVert \cdot \rVert_F$ is the Frobenius norm of the matrix.
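To make the decomposition claim concrete, here is a small NumPy sketch (the dimensions and the random comparison basis are arbitrary illustrative choices): by the Eckart–Young theorem, the rank-$k$ PCA reconstruction attains a Frobenius residual no larger than projection onto any other rank-$k$ basis.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, k = 5, 200, 2          # dimensions, samples, rank of the decomposition

Y = rng.normal(size=(n, m))              # data: m vectors of n dimensions (columns)
Yc = Y - Y.mean(axis=1, keepdims=True)   # center the data, as PCA assumes

# PCA basis X: top-k left singular vectors of the centered data
U, s, Vt = np.linalg.svd(Yc, full_matrices=False)
X = U[:, :k]
A = X.T @ Yc                 # coefficient matrix of the decomposition

# residual of the rank-k PCA decomposition
pca_residual = np.linalg.norm(Yc - X @ A, "fro") ** 2

# any other rank-k orthonormal basis gives a residual at least as large
Q, _ = np.linalg.qr(rng.normal(size=(n, k)))   # random orthonormal basis
rand_residual = np.linalg.norm(Yc - Q @ (Q.T @ Yc), "fro") ** 2

print(pca_residual <= rand_residual)  # True
```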
Because PCA minimizes squared $L_2$ norms, it has the same issue as least squares or fitting a Gaussian: it is sensitive to outliers. Since deviations from the fit are squared, outliers contribute disproportionately to the total norm and therefore drive the PCA components.
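A minimal sketch of this effect in NumPy (the helper `first_pc` and all constants are illustrative choices, not a standard API): the data lie along the $x$-axis, and a single far-off point is enough to rotate the first principal component toward it.

```python
import numpy as np

rng = np.random.default_rng(1)

# 100 points spread along the x-axis with small noise in y
Y = np.column_stack([rng.normal(0, 5, 100), rng.normal(0, 0.5, 100)])

def first_pc(data):
    """First principal component: top right singular vector of centered data."""
    data = data - data.mean(axis=0)
    _, _, Vt = np.linalg.svd(data, full_matrices=False)
    return Vt[0]

print(np.round(first_pc(Y), 2))       # roughly [±1, 0]: along the x-axis

# one outlier far off the main direction dominates the squared residuals
Y_out = np.vstack([Y, [0.0, 100.0]])
print(np.round(first_pc(Y_out), 2))   # rotated toward the outlier, near [0, ±1]
```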
