
I am trying to build a classification model. I have 600 variables to start with and am trying to reduce them to a much smaller set to pass to my model (e.g. logistic regression).

I did some research, and it looks like PCA or factor analysis can be used. But I need to know the importance of each variable. That is, I don't want to pass components or factors to the model; the actual variables need to be the predictors.

I mostly work in SAS, which has a procedure called VARCLUS that can help identify the significant variables. Are there any other recommended procedures?

amoeba
user3252148

1 Answer


It is not recommended to use PCA or factor analysis as a preprocessing step before fitting linear models. The results will be harder to interpret, and you may lose predictive power that is carried by specific variables.

A better approach is to use some form of stepwise variable selection. This is explained in this thread: Choosing variables to include in a multiple linear regression model
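To make the idea concrete, here is a minimal sketch of forward stepwise selection for logistic regression, written in plain Python so it runs without extra packages. It is only an illustration, not the method from the linked thread: the AIC stopping rule, the gradient-ascent fit, the function names (`fit_logistic`, `aic`, `forward_stepwise`) and the synthetic data are all assumptions made for this example.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_logistic(X, y, cols, n_iter=200, lr=0.5):
    """Fit logistic regression on the columns in `cols` by gradient ascent.

    Returns the (approximately maximised) log-likelihood.
    """
    n = len(y)
    w = [0.0] * len(cols)
    b = 0.0
    for _ in range(n_iter):
        grad_w = [0.0] * len(cols)
        grad_b = 0.0
        for row, target in zip(X, y):
            p = sigmoid(b + sum(wj * row[c] for wj, c in zip(w, cols)))
            err = target - p
            grad_b += err
            for j, c in enumerate(cols):
                grad_w[j] += err * row[c]
        b += lr * grad_b / n
        w = [wj + lr * g / n for wj, g in zip(w, grad_w)]
    ll = 0.0
    for row, target in zip(X, y):
        p = sigmoid(b + sum(wj * row[c] for wj, c in zip(w, cols)))
        p = min(max(p, 1e-12), 1 - 1e-12)  # guard against log(0)
        ll += target * math.log(p) + (1 - target) * math.log(1 - p)
    return ll


def aic(X, y, cols):
    # AIC = 2k - 2 log L, where k counts the coefficients plus the intercept.
    return 2 * (len(cols) + 1) - 2 * fit_logistic(X, y, cols)


def forward_stepwise(X, y):
    """Greedily add the variable that lowers AIC most; stop when none does."""
    selected = []
    remaining = list(range(len(X[0])))
    best = aic(X, y, selected)  # intercept-only model
    while remaining:
        cand_aic, cand = min((aic(X, y, selected + [c]), c) for c in remaining)
        if cand_aic >= best:
            break
        selected.append(cand)
        remaining.remove(cand)
        best = cand_aic
    return selected


# Synthetic data (an assumption for illustration): 5 candidate variables,
# of which only columns 0 and 2 actually drive the outcome.
rng = random.Random(0)
X = [[rng.gauss(0, 1) for _ in range(5)] for _ in range(200)]
y = [1 if rng.random() < sigmoid(2.0 * r[0] - 2.0 * r[2]) else 0 for r in X]

selected = forward_stepwise(X, y)
print("Selected columns, in order of entry:", selected)
```

The selected predictors are original variables, which is what the question asks for. In SAS, the closest built-in equivalent is the SELECTION= option on the MODEL statement of PROC LOGISTIC. With 600 candidates, any greedy search like this can pick up noise variables, so the final model is best checked on held-out data.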