
I am running a principal component analysis (PCA) in R. Calling princomp() on a data.frame replaces all column headers with PC0, PC1, ... in the result.

To filter new data using the result of a previous PCA, the unique names of the original data.frame are required. Is there a way to avoid renaming the column headers?

user259819
  • Although this sounds like an R coding question (which would be off topic here), what prompts this question is actually a confusion about what PCA is & how it works. That can be cleared up here & should be considered on topic, IMO. – gung - Reinstate Monica Sep 16 '15 at 20:25
  • My understanding was flawed. As pointed out below, a PCA returns weighted combinations of all variables. I had pictured a linear combination with binary weights that simply includes or excludes each original variable, in which case the result could have been mapped bijectively back to the original variables. – user259819 Sep 17 '15 at 09:05

1 Answer


No, the original names are not required and, indeed, cannot be kept: no component corresponds to a single original column.

The components are weighted combinations of all the variables (original columns), so the original labels would be grossly misleading.
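
To spell out what "weighted combination" means (the symbols here are illustrative notation, not taken from the question): with $p$ centred original variables $x_1, \dots, x_p$ and loadings $w_{kj}$, the score on component $j$ is

$$ z_j = w_{1j} x_1 + w_{2j} x_2 + \dots + w_{pj} x_p, \qquad j = 1, \dots, p. $$

In general every $w_{kj}$ is non-zero, so each component draws on all of the original columns, and none of them can be matched to one original name.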

I am not sure what you mean by "filter" new data, but you can get the principal component scores on new data through fairly simple algebra. Most software packages can do this for you; I am sure R can, although I am not an R expert and don't know exactly how. Questions about how to do something in a specific language are off topic here, in any case.
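
As a rough sketch of that algebra in R (not part of the original answer): the example below uses the built-in `USArrests` data set as a stand-in, and the split into `train` and `new` is purely illustrative. It relies on `predict()` for a `princomp` fit, which matches the columns of `newdata` by their original names; those names are kept as the row names of the loadings.

```r
# Illustrative split: fit on the first 40 rows, treat the rest as "new" data.
train <- USArrests[1:40, ]
new   <- USArrests[41:50, ]

# PCA on the correlation matrix (variables are centred and scaled).
fit <- princomp(train, cor = TRUE)

# The original column names survive as row names of the loadings.
rownames(fit$loadings)

# Scores for the new rows; columns of `new` are matched by original name.
scores_new <- predict(fit, newdata = new)

# The same scores by hand: centre and scale with the training values,
# then multiply by the loading matrix.
manual <- scale(new, center = fit$center, scale = fit$scale) %*%
  unclass(fit$loadings)

# Both routes should agree up to numerical tolerance.
stopifnot(isTRUE(all.equal(scores_new, manual, check.attributes = FALSE)))
```

The same pattern should carry over to `prcomp()`, where the loadings are stored in `fit$rotation` instead of `fit$loadings`.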

Peter Flom
  • [This Cross Validated page](http://stats.stackexchange.com/q/2592/28500) provides examples in R of projecting new vectors into coordinates of a PCA. – EdM Sep 16 '15 at 20:52