
I'm going to run a PCA on a set of binary variables and an ordinal variable. The binary variables take the values 0 and 1; the ordinal variable is on a scale of 1 to 50.

Is it possible to do PCA with this data set?

Python123
  • Questions about using specific pieces of statistical software are off-topic here. You may want to rewrite the question to focus on how to perform PCA with binary and ordinal variables. When you get an answer to that question then you can look for tools in R to perform the various steps necessary. – Ian_Fin Nov 28 '16 at 10:53
  • Thank you for pointing that out. I have rewritten my question. – Python123 Nov 28 '16 at 10:54
  • If you have a large number of observations and a good spread of responses then with a 50 level ordinal variable you may well want to consider treating it as continuous. – Ian_Fin Nov 28 '16 at 11:01
  • No, you cannot use PCA on binary variables, though you can, in principle, do it on ordinal variables. On binary variables you can use Multiple Correspondence Analysis (MCA). – utobi Nov 28 '16 at 11:44
  • @utobi If you believed that the binary variables were the manifestation of a latent continuous variable then could you not use tetra/polychoric correlations to generate a correlation matrix to apply the PCA to? – Ian_Fin Nov 28 '16 at 12:36
  • I'm not aware of this approach for binary categorical variables. Can you point to some reference? – utobi Nov 28 '16 at 13:07
  • @utobi Not off the top of my head, but I've seen similar done with polychoric correlation and ordinal variables that reflect continuous latent variables. I don't see any obvious reason why it would be any different with binary variables as long as there was still that continuous latent variable. After all, in such a case, a binary variable is just an ordinal variable with only two levels. I'm not saying it's appropriate in all cases, but I think your claim that you cannot use PCA on binary variables is too strong and could perhaps be qualified. – Ian_Fin Nov 28 '16 at 13:39
  • @Ian_Fin I agree with the latent approach applied to ordinal data. But I believe it does not make sense for binary variables. Note, I'm talking about binary variables, e.g. gender, race, etc. and not about two-level ordinal variables. – utobi Nov 28 '16 at 14:09
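The two suggestions in the comments (treating the 50-level ordinal variable as continuous, and running PCA on a correlation matrix) can be sketched roughly as follows. This is only an illustration under stated assumptions: the data are synthetic, the helper name `pca_from_correlation` is made up for this example, and the matrix used is the ordinary Pearson correlation matrix, which for 0/1 variables amounts to the phi and point-biserial coefficients. If the binary variables are believed to reflect a continuous latent variable, a tetrachoric/polychoric matrix estimated elsewhere could be passed to the same helper instead; whether either choice is appropriate is exactly what the comments above disagree on.

```python
import numpy as np


def pca_from_correlation(corr):
    """Eigendecompose a correlation matrix (hypothetical helper).

    Returns eigenvalues in descending order, the matching loadings
    (one component per column), and the explained-variance ratios.
    """
    corr = np.asarray(corr, dtype=float)
    eigvals, eigvecs = np.linalg.eigh(corr)  # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]        # reorder to descending
    eigvals = eigvals[order]
    eigvecs = eigvecs[:, order]
    explained = eigvals / eigvals.sum()
    return eigvals, eigvecs, explained


# Synthetic data (an assumption for illustration): three 0/1 variables and
# one ordinal variable on a 1..50 scale, all driven by one latent factor.
rng = np.random.default_rng(0)
n = 500
latent = rng.normal(size=n)
binary = (latent[:, None] + rng.normal(size=(n, 3)) > 0).astype(float)
ordinal = np.clip(np.round(25 + 8 * latent + rng.normal(scale=4, size=n)), 1, 50)
X = np.column_stack([binary, ordinal])

# Pearson correlations, treating the ordinal variable as continuous.
corr = np.corrcoef(X, rowvar=False)
eigvals, loadings, explained = pca_from_correlation(corr)

# Component scores from the standardized data.
Z = (X - X.mean(axis=0)) / X.std(axis=0)
scores = Z @ loadings

print("explained variance ratio:", np.round(explained, 3))
```

Working from the correlation matrix rather than the covariance matrix keeps the 1 to 50 scale of the ordinal variable from dominating the 0/1 variables.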

0 Answers