
I have created composite variables after a factor analysis. The original variables were categorical (levels 1-7), and the composites were created by averaging the items loading on each factor.

I decided to round the averaged composites to the nearest unit. This seemed logical: since the original variables were categories, I thought the resulting composite variables should also be categories.

I am using these rounded variables as dependent variables in ANOVAs.
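
To make the procedure concrete, here is a minimal sketch of the averaging and rounding steps. The data are simulated and purely hypothetical (200 respondents, 4 items on a 1-7 scale), and `responses`, `composite`, and `rounded` are illustrative names rather than anything from my actual dataset.

```python
import random

random.seed(0)  # hypothetical data, for illustration only

# Simulate 200 respondents answering 4 items on a 1-7 scale (made-up data).
n_respondents, n_items = 200, 4
responses = [[random.randint(1, 7) for _ in range(n_items)]
             for _ in range(n_respondents)]

# Composite = mean of the items assumed to load on one factor.
composite = [sum(row) / n_items for row in responses]

# Rounded version, as described above. int(c + 0.5) rounds halves up;
# Python's built-in round() would round halves to the nearest even integer.
rounded = [int(c + 0.5) for c in composite]

# Rounding moves each score by at most 0.5 and merges distinct values.
max_shift = max(abs(c - r) for c, r in zip(composite, rounded))
print("distinct composite values:", len(set(composite)))
print("distinct rounded values:  ", len(set(rounded)))
print("largest change from rounding:", max_shift)
```

With 4 items the unrounded mean can take steps of 0.25, so rounding to whole units discards some of the resolution the composite had.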

Is the rounding a bad thing? Should I ditch it? Keep it?

I couldn't find a clear answer on this elsewhere.

Alexis
GIP
  • Welcome to CV, Guillermo Ivan Pereira! I think this is an important question. Can you clarify whether those categories are *ordered* or *unordered*? That is, is category 2 substantively *more than* category 1 and *less than* category 3? (For example, does 2 mean something like "more intensity", "more capable," or "more emotion" than 1, and vice versa for 3?) – Alexis Feb 11 '18 at 19:10
  • Yes, these are ordinal categories, 1 being the least and 7 the most in the category. – GIP Feb 11 '18 at 19:15

1 Answer


I'd suggest something entirely different. Factor analysis assumes normality for the variables, but as you said, your variables are categorical, so that assumption is violated. When the original variables are categorical, it is advisable to create composite variables with an optimal scaling method such as nonlinear PCA (available in SPSS: Data Reduction > Optimal Scaling). I believe you can also do it using the homals package in R.

The next step is usually to break the composite variable up into terciles, quintiles, or whatever grouping you think is appropriate for use in further analysis. This is how you should make a composite variable categorical. Similar methods are widely used to create a categorical composite variable called the wealth index: first you retain the scores of the first component, and then you categorize them into quintiles. In your case, you can categorize the composite score into 7 categories if that is appropriate.
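
The categorization step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `scores` stands in for component scores you have already saved (here they are simulated), and `quantile_bins` is a hypothetical helper, not a function from SPSS or homals.

```python
import random

def quantile_bins(scores, n_bins):
    """Assign each score a bin from 1..n_bins with roughly equal counts.

    Bins are assigned by rank, so tied scores may land in adjacent bins.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    bins = [0] * len(scores)
    for rank, i in enumerate(order):
        bins[i] = rank * n_bins // len(scores) + 1
    return bins

random.seed(1)
# Hypothetical first-component scores for 100 respondents.
scores = [random.gauss(0, 1) for _ in range(100)]

quintile = quantile_bins(scores, 5)
counts = [quintile.count(b) for b in range(1, 6)]
print(counts)  # -> [20, 20, 20, 20, 20]
```

Calling `quantile_bins(scores, 3)` would give terciles and `quantile_bins(scores, 7)` seven categories. The bins hold equal numbers of respondents, so they are not equally wide on the score scale.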

Blain Waan
  • Thank you for the help. My variables don't have major violations of normality, and in the literature I've seen factor analysis described as useful in these cases. So, back to the original question, and before I try your suggestion: is the rounding adequate under my approach, or should I not round the Factor Based Scores to the nearest unit? Thank you. – GIP Feb 11 '18 at 20:47
  • Could you please explain how a categorical variable can be treated as normally distributed? The normal distribution is a probability distribution for continuous random variables, not for categorical ones. A factor analysis is going to treat all your categorical variables as continuous. So, if you use the factor scores as your composite variables, they are also going to be on a continuous scale. Rounding them has nothing to do with whether your original variables were categorical. Does that make sense now? – Blain Waan Feb 12 '18 at 03:25
  • Thank you, Blain, for the detail. I have done the nonlinear PCA in SPSS, and I now have the components from the analysis as output. Can you please guide me on how to create composite variables from this output? Can I average the values of the variables with the strongest loadings on each component? I couldn't find a detailed explanation of how to create the composite variable after the CATPCA is conducted. – GIP Feb 14 '18 at 00:23
  • Hi GIP, the component score you get when you choose to save the component scores is already a weighted average of your original variables (or, you could say, of their latent variables). So you can just take the 1st component score and divide it into quintiles, terciles, or however many categories you want. That gives you a categorized index. If you want to use the index as a continuous variable in your future analysis, then just use the 1st component score as the index. Sorry for the late reply. – Blain Waan Feb 21 '18 at 22:17
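
The point that a component score is already a weighted combination of the variables can be illustrated with a rough sketch. Note the substitution: this uses ordinary linear PCA on standardized variables, computed by power iteration, not the optimal-scaling CATPCA procedure from SPSS, and the data are simulated. All names (`rows`, `w`, `score`) are hypothetical.

```python
import math
import random

random.seed(2)
# Hypothetical data: 150 respondents, 3 items driven by one latent trait.
rows = []
for _ in range(150):
    trait = random.gauss(0, 1)
    rows.append([trait + random.gauss(0, 0.5) for _ in range(3)])

n, p = len(rows), len(rows[0])

# Standardize each item to mean 0 and standard deviation 1.
means = [sum(r[j] for r in rows) / n for j in range(p)]
sds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1))
       for j in range(p)]
z = [[(r[j] - means[j]) / sds[j] for j in range(p)] for r in rows]

# Correlation matrix of the items.
corr = [[sum(r[a] * r[b] for r in z) / (n - 1) for b in range(p)]
        for a in range(p)]

# Power iteration: converges to the leading eigenvector, i.e. the weights
# of the first principal component.
w = [1.0] * p
for _ in range(200):
    w = [sum(corr[a][b] * w[b] for b in range(p)) for a in range(p)]
    norm = math.sqrt(sum(x * x for x in w))
    w = [x / norm for x in w]

# First-component score: a weighted sum of the standardized items.
score = [sum(w[j] * r[j] for j in range(p)) for r in z]
print("component weights:", [round(x, 3) for x in w])
```

Because the weights are already built into `score`, there is no need to average the high-loading items again. The score can be used as is, or cut into quantile groups.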