Is there a threshold value (or values) for determining whether PCA is useful at all, especially for high-dimensional data?

Question

Apologies if this is a naïve question, but it's not so naïve to me!

Let's first assume we have 2D data which are perfectly linear but not along the x- or y-axis. PCA will rotate the data so that they become parallel to the x- or y-axis, and the direction of maximum variation will be the 'first' direction.

Let's next assume that the data deviate quite a bit from being perfectly linear. In this case, if we do PCA no matter what, the direction with the smaller variance is going to be that of noise. My question is: is there a threshold $T$ that we can determine a priori (i.e. before doing anything at all) from the data that'll tell us whether PCA is of use or not? If yes, how can we determine $T$? For example, one should not apply PCA to sphere- or circle-valued data $\{(\cos \theta_j, \sin \theta_j) : 1 \leq j \leq n\}$, because there's no linearity there.

For two dimensional data (i.e. data with just two features), we can plot them and see if they're close to being linear or not. But for high dimensional data, what can we do? Is there a way we can determine a scalar or a set of scalars that'll tell us whether PCA is any good?
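To make the two cases above concrete, here is a minimal sketch (my own illustration, using NumPy and synthetic data) that computes one candidate scalar: the fraction of total variance carried by each principal direction. The helper name `explained_variance_ratio`, the sample size, and the noise level are all assumptions chosen for the example. Note that this is computed from the covariance eigenvalues, so it is not truly "a priori".

```python
import numpy as np

def explained_variance_ratio(X):
    """Fraction of total variance carried by each principal component."""
    Xc = X - X.mean(axis=0)
    # Eigenvalues of the sample covariance matrix are the PC variances.
    eigvals = np.linalg.eigvalsh(np.cov(Xc, rowvar=False))
    eigvals = np.clip(eigvals[::-1], 0.0, None)  # sort descending, guard round-off
    return eigvals / eigvals.sum()

rng = np.random.default_rng(0)
n = 500  # hypothetical sample size

# Nearly linear 2D data, not aligned with either axis.
t = rng.normal(size=n)
line = np.column_stack([t, 2.0 * t + 0.1 * rng.normal(size=n)])

# Circle-valued data (cos(theta_j), sin(theta_j)).
theta = rng.uniform(0.0, 2.0 * np.pi, size=n)
circle = np.column_stack([np.cos(theta), np.sin(theta)])

ratio_line = explained_variance_ratio(line)
ratio_circle = explained_variance_ratio(circle)
print("near-linear:", ratio_line)
print("circle:     ", ratio_circle)
```

For the near-linear data almost all of the variance sits in the first direction, whereas for the circle the two directions carry roughly equal shares, so no direction stands out. The same ratios can be computed in any number of dimensions.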

A first guess is to look at the correlation matrix. If it is nearly diagonal, i.e. there are nearly no correlations, it is meaningless to single out *one dominating* direction. On the other hand, if all correlations are high, an overall correlation (a general factor) might explain a high percentage of all (co)variance. If the correlations are high within subsets of variables and small across those subsets, then a multifactor situation seems likely (which could be estimated, for instance, by varimax/promax rotations). — Gottfried Helms, May 05 '16 at 09:56
You may use [Bartlett's sphericity test](http://stats.stackexchange.com/q/92791/3277). — ttnphns, May 05 '16 at 12:27
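The two comments above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the function names `bartlett_sphericity` and `mean_abs_offdiag_corr` are hypothetical, the data are synthetic, and the statistic uses the usual textbook form $\chi^2 = -\left(n - 1 - \frac{2p+5}{6}\right)\ln\det R$ with $p(p-1)/2$ degrees of freedom, where $R$ is the sample correlation matrix.

```python
import numpy as np

def bartlett_sphericity(X):
    """Bartlett's test statistic for H0: the correlation matrix is the identity."""
    n, p = X.shape
    R = np.corrcoef(X, rowvar=False)
    # slogdet is numerically safer than log(det(R)) when R is near-singular.
    _, logdet = np.linalg.slogdet(R)
    stat = -(n - 1 - (2 * p + 5) / 6.0) * logdet
    df = p * (p - 1) // 2
    return stat, df

def mean_abs_offdiag_corr(X):
    """Average absolute off-diagonal correlation (near 0 means a nearly diagonal R)."""
    R = np.corrcoef(X, rowvar=False)
    p = R.shape[0]
    off_diagonal = R[~np.eye(p, dtype=bool)]
    return np.abs(off_diagonal).mean()

rng = np.random.default_rng(1)
n, p = 300, 6  # hypothetical sizes

# Case 1: independent features, so R is close to the identity.
independent = rng.normal(size=(n, p))

# Case 2: one shared factor plus noise, so all correlations are high.
factor = rng.normal(size=(n, 1))
correlated = factor + 0.5 * rng.normal(size=(n, p))

stat_ind, df = bartlett_sphericity(independent)
stat_cor, _ = bartlett_sphericity(correlated)
print("independent:", stat_ind, df, mean_abs_offdiag_corr(independent))
print("correlated: ", stat_cor, df, mean_abs_offdiag_corr(correlated))
```

Under the null hypothesis the statistic is approximately chi-square distributed with `df` degrees of freedom, so a p-value could be obtained with, for example, `scipy.stats.chi2.sf(stat, df)`. A large statistic (small p-value) suggests the variables are correlated enough for PCA to find dominant directions; it says nothing about non-linear structure such as the circle example in the question.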


0 Answers