I want to perform PCA, but one of its assumptions is that the relationships between all variables (83 of them) are linear. Do I have to build a scatter plot matrix (83x83) to check whether each pair of variables is linearly or non-linearly related, and then drop the variables that fail the check? Or is there another way to transform all the variables so that their relationships become linear?

Thank you for reading this post

Trung

1 Answer

I think PCA can be viewed in different ways, and in one of them it has no assumptions at all. It is just a function that maps data from a high-dimensional space to a low-dimensional one while satisfying certain properties (for example, the projected data retain as much variance as possible). Of course, if the data is non-linear, the mapping may not be good.

Therefore, it is perfectly OK to perform PCA on a non-linear data set, and there is no need to verify linearity beforehand. After running PCA, check whether the results are good enough for your purpose.
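To make "check the results" concrete, here is a minimal sketch (not the only way to do it) that runs PCA through NumPy's SVD on two made-up 2-D data sets, one roughly linear and one curved, and compares the explained variance ratio and the reconstruction error of the first component. The data, the helper names `pca` and `reconstruction_error`, and the choice of these two diagnostics are my own assumptions for illustration.

```python
import numpy as np

def pca(X, k):
    """Project X (n_samples x n_features) onto its top-k principal components."""
    mean = X.mean(axis=0)
    Xc = X - mean  # PCA works on centered data
    # SVD of the centered data: the rows of Vt are the principal directions
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:k]
    scores = Xc @ components.T
    explained_ratio = (s ** 2) / np.sum(s ** 2)
    return scores, components, mean, explained_ratio[:k]

def reconstruction_error(X, scores, components, mean):
    """Mean squared distance between X and its rank-k reconstruction."""
    X_hat = scores @ components + mean
    return float(np.mean(np.sum((X - X_hat) ** 2, axis=1)))

# Synthetic data (an assumption for illustration, not from the question)
rng = np.random.default_rng(0)
t = rng.uniform(-1, 1, size=500)
X_linear = np.column_stack([t, 2 * t]) + rng.normal(scale=0.05, size=(500, 2))
X_curved = np.column_stack([t, t ** 2]) + rng.normal(scale=0.05, size=(500, 2))

results = {}
for name, X in [("linear", X_linear), ("curved", X_curved)]:
    scores, comps, mean, ratio = pca(X, k=1)
    err = reconstruction_error(X, scores, comps, mean)
    results[name] = (float(ratio[0]), err)
    print(f"{name}: explained variance ratio = {ratio[0]:.3f}, "
          f"reconstruction MSE = {err:.4f}")
```

PCA runs without complaint on both data sets. On the curved one, a single component explains less of the variance and reconstructs the data less accurately, and whether that loss is acceptable depends on what you need the components for.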

Here is an example of a PCA transformation applied to non-linear data. Although the answer there calls it a bad model, it may still be adequate for certain use cases:

How to understand "nonlinear" as in "nonlinear dimensionality reduction"?

Haitao Du
  • Thank you for your answer. Do you think I should transform my non-linear data set to make it linear before running PCA? Some also say that I only need to check whether the principal components are linear in the variables, rather than checking the linear relationships between the variables themselves. – Trung Mar 07 '18 at 22:31