I have been told to run a factor analysis using the stepwisefit function in MATLAB.

Basically, this function fits a model composed of $T$ factors $F=(f_1, \dots, f_T)$, each of which has $N$ values, to a target vector $y$ of the same length.

I know for a fact that my factors are correlated, and I was wondering whether this function assumes that the factors are uncorrelated.

Should I still use this function? If not, which function or method should I choose to perform my analysis?

SRKX
  • Strange. I don't see any reference to factor analysis under the link. It reads that `stepwisefit` is stepwise regression analysis – ttnphns Jul 14 '12 at 14:30
  • Maybe my wording is wrong, I understand factor analysis as a multiple regression over different factors. – SRKX Jul 14 '12 at 14:31
  • I can't know what you might intend. You can see definition of "factor analysis" by pointing on the tag (or read in Wikipedia). Does its meaning fit your case? – ttnphns Jul 14 '12 at 14:35
  • Yes, it absolutely does. Basically, I have a lot of underlying factors and I would like to know which of them are really meaningful. I would then like to find the right weight to the remaining factors to find my optimal fit. – SRKX Jul 14 '12 at 14:41
  • @SRKX, why don't you simply drop the correlated factors from further analysis? It is easy to do. Moreover, *stepwisefit* will do this for you. – Paul Jul 14 '12 at 15:54
  • I don't know that MATLAB function *per se*, but I imagine it works like any other stepwise selection algorithm. It's worth pointing out here that stepwise model selection algorithms are invalid. (If that doesn't make sense to you / you want to understand why, it may help to read my answer here: [algorithms-for-automatic-model-selection](http://stats.stackexchange.com/questions/20836//20856#20856).) Moreover, multicollinearity isn't a reason to use stepwise selection routines, the existence of correlated variables makes the problems associated with stepwise selection worse. – gung - Reinstate Monica May 23 '13 at 04:13

1 Answer

You should use the elastic net method instead of the stepwise method.

You can include highly correlated predictors in a multiple linear regression; however, you won't get reliable results from your analysis unless you regularize the regression.

The regularized linear regression function in MATLAB is `lasso`. When you use it, set the `Alpha` parameter to a value strictly between 0 and 1 to get the elastic net instead of the default lasso (`Alpha` equal to 1). That is because your predictors are correlated, and you would like your regression model to take this into account.
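As a rough illustration of the call described above, here is a minimal sketch. The data are synthetic and purely hypothetical (a shared component `z` is used to make the factors correlated), the choice `Alpha = 0.5` and 10-fold cross-validation are assumptions for the example rather than recommendations, and `lasso` requires the Statistics and Machine Learning Toolbox.

```matlab
% Hypothetical synthetic data: N observations of T correlated factors.
rng(0);                                    % make the random draws reproducible
N = 200;
T = 5;
z = randn(N, 1);                           % shared component that induces correlation
F = 0.8 * repmat(z, 1, T) + 0.6 * randn(N, T);
y = 2 * F(:, 1) - F(:, 3) + 0.5 * randn(N, 1);

% Elastic net: Alpha strictly between 0 and 1 (Alpha = 1 is the default lasso).
% 'CV', 10 asks for 10-fold cross-validation over the lambda path.
[B, FitInfo] = lasso(F, y, 'Alpha', 0.5, 'CV', 10);

% Pick the sparsest model whose CV error is within one standard error
% of the minimum.
idx       = FitInfo.Index1SE;
beta      = B(:, idx);                     % one weight per factor
intercept = FitInfo.Intercept(idx);
selected  = find(beta ~= 0);               % indices of the factors that were kept

disp(selected');
```

Here `selected` lists the factors that keep a nonzero weight, and `beta` together with `intercept` gives the fitted model. With real data, `F` and `y` would be replaced by your own factor matrix and target vector, and it is worth trying several `Alpha` values.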

Niousha