Reference for this claim: important features in data can be "hidden" in the higher PCA axes that are typically thrown out

Question

I remember reading a paper a while ago demonstrating cases in which PCA fails to capture important features of a data set in the first few principal components, with those features instead appearing in lower-variance components.
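To make the claim concrete, here is a minimal sketch of the kind of case I mean. It is not taken from the paper I am looking for. The data are synthetic, and the names (`x1`, `x2`, `y`), the scales, and the labeling rule are all assumptions chosen for illustration: the label depends only on a low-variance feature, so the first principal component explains nearly all the variance yet is almost uncorrelated with the label.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000

# Two independent features (illustrative assumption):
# x1 has large variance, x2 has small variance.
x1 = rng.normal(scale=10.0, size=n)
x2 = rng.normal(scale=1.0, size=n)
X = np.column_stack([x1, x2])

# Hypothetical binary label that depends only on the low-variance feature.
y = (x2 > 0).astype(int)

# PCA via SVD of the centered data. Rows of Vt are the principal axes,
# ordered from highest to lowest variance.
Xc = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
scores = Xc @ Vt.T                  # component scores, one column per PC
explained = S**2 / np.sum(S**2)     # fraction of variance per component

# Absolute correlation of each component's scores with the label.
corr = [abs(np.corrcoef(scores[:, k], y)[0, 1]) for k in range(2)]

print("explained variance ratio:", explained)
print("|corr(PC, y)|:", corr)
```

Keeping only the first component here would discard essentially all of the information about `y`, even though that component accounts for almost all of the variance.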

I think someone here recently mentioned the paper in a comment, and it jogged my memory.

I've tried doing a search on Google, Google Scholar, and my library database, but I haven't found anything. Coming up with the right search terms for something like this is not easy.

What paper is this?

One mentioned [here](http://stats.stackexchange.com/questions/87198/) or [here](http://stats.stackexchange.com/questions/101485)? Jolliffe (2010), *Principal components analysis*, deals with this topic & may give more references. — Scortchi - Reinstate Monica, Apr 15 '15 at 08:43
Here is another relevant question on this site when using PCA as a data reduction before regression, [*Principal component regression analysis using SPSS*](http://stats.stackexchange.com/q/104991/1036). In the comments to my answer I list several references (that are redundant with some of the ones Nick Stauner mentions). — Andy W, Apr 15 '15 at 11:45
@Scortchi yes it was the second question you linked to. Post that as an answer — shadowtalker, Apr 15 '15 at 12:23
Great references in the other questions as well. PCA is a hidden specialty here — shadowtalker, Apr 15 '15 at 12:24
@ssdecontrol: Good. I was thinking to mark this as a duplicate rather than post a link-only answer (I've nothing to add to it). — Scortchi - Reinstate Monica, Apr 15 '15 at 12:35
@Scortchi I don't see a problem with a terse answer if the answer is complete and correct — shadowtalker, Apr 15 '15 at 12:36
@ssdecontrol: On reflection there's little difference between looking for "examples" & "references", so I added the `feature selection` tag & marked it as a duplicate - I think the wording in your question is nice & it'll be a useful pointer to Nick's answer (& the others). — Scortchi - Reinstate Monica, Apr 15 '15 at 12:46
See also [this recent question](http://stats.stackexchange.com/questions/141864) where I tried to provide an answer that would serve as a bit of an overview of several CV threads on this topic, including ones mentioned by @Scortchi. — amoeba, Apr 15 '15 at 15:05
By the way, you refer to the low-variance components as "higher" ones in the title and as "lower" ones in the first paragraph :) I find this confusing. — amoeba, Apr 15 '15 at 15:07
@amoeba good point. I meant "lower/higher" as in "lower/higher" _index_, in that the "first" principal component is the one with the highest variance. — shadowtalker, Apr 15 '15 at 16:02
@Scortchi I'm not sure I agree in principle, but for the purpose of actually helping users find the right information I'm fine with that. Principles are overrated anyway. — shadowtalker, Apr 15 '15 at 16:04
