
I am building a driver model to understand what makes a survey respondent a Promoter, Passive, or Detractor. I have used the survey's satisfaction drivers as my independent variables. The Fast and Friendly satisfaction variables turn out to be highly correlated (a correlation of about 0.8). Both have an impact on the respondent, and it doesn't make sense to remove either of them from the model, as we need to know the impact of both variables. How should I handle this?

  • 1
    Welcome to CV. Since you’re new here, you may want to take our [tour], which has information for new users. High correlation is not necessarily a problem. Have you checked the VIF? Here is a related question: https://stats.stackexchange.com/questions/38093/how-to-deal-with-high-correlation-among-predictors-in-multiple-regression – T.E.G. Dec 20 '17 at 08:18
  • Please register &/or merge your accounts (you can find information on how to do this in the **My Account** section of our [help]), then you will be able to edit & comment on your own question. – gung - Reinstate Monica Dec 20 '17 at 12:25
  • @T.E.G I have checked it. It is under 4. I built an lm model and used the vif function in the car package. Does that mean the model is fine? – himanshu sharma Dec 20 '17 at 09:41
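
For readers who want to see why a VIF below 4 is consistent with a correlation of about 0.8, here is a minimal sketch in Python with NumPy. It assumes only two predictors; the simulated `fast` and `friendly` columns are made-up stand-ins for the survey data, not the asker's actual variables. With two predictors the VIF reduces to 1 / (1 - r^2), so r = 0.8 gives roughly 2.8.

```python
import numpy as np

def vif_from_data(X):
    """Return the VIF of each column of X (rows = respondents, columns = drivers)."""
    corr = np.corrcoef(X, rowvar=False)
    # The VIF of predictor j is the j-th diagonal element of the
    # inverse of the predictors' correlation matrix.
    return np.diag(np.linalg.inv(corr))

# Simulated data (an assumption for illustration, not the real survey).
rng = np.random.default_rng(0)
n = 5000
fast = rng.normal(size=n)
# Build a second variable whose population correlation with `fast` is 0.8.
friendly = 0.8 * fast + np.sqrt(1 - 0.8**2) * rng.normal(size=n)

X = np.column_stack([fast, friendly])
vifs = vif_from_data(X)
print(vifs)
```

With more than two predictors the same function applies, but the VIF then reflects how well each driver is explained by all the others jointly, not just one pairwise correlation.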

1 Answer


Beyond a preventive VIF analysis, you can use shrinkage regression (the LASSO, for example) to exclude regressors that are highly correlated with others.

  • 3
    Shrinkage methods can be very helpful but don't help directly with co-linearities. Lasso for example can be harmed by it. Co-linearities can make lasso select features somewhat randomly. It is often better to meet the problem head-on using an initial unsupervised learning step, e.g., variable clustering or sparse principal components. – Frank Harrell Dec 20 '17 at 13:32
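
A rough sketch of the unsupervised grouping step mentioned in the comment above, in Python with NumPy. This is a simplified stand-in for variable clustering, not a specific library's method: it groups variables whose absolute pairwise correlation reaches a cutoff. The function name `cluster_variables`, the 0.7 threshold, and the third driver `clean` are all hypothetical choices made for illustration.

```python
import numpy as np

def cluster_variables(X, names, threshold=0.7):
    """Group columns of X whose absolute correlation is >= threshold.

    Variables are linked transitively (connected components), which is a
    crude approximation of single-linkage variable clustering.
    """
    corr = np.abs(np.corrcoef(X, rowvar=False))
    p = corr.shape[0]
    parent = list(range(p))  # union-find forest, one node per variable

    def find(i):
        # Follow parent links to the root, shortening the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(p):
        for j in range(i + 1, p):
            if corr[i, j] >= threshold:
                parent[find(j)] = find(i)

    clusters = {}
    for i in range(p):
        clusters.setdefault(find(i), []).append(names[i])
    return list(clusters.values())

# Simulated data (an assumption for illustration, not the real survey).
rng = np.random.default_rng(1)
n = 2000
fast = rng.normal(size=n)
friendly = 0.8 * fast + 0.6 * rng.normal(size=n)  # correlation ~0.8 with fast
clean = rng.normal(size=n)                        # hypothetical unrelated driver

X = np.column_stack([fast, friendly, clean])
clusters = cluster_variables(X, ["fast", "friendly", "clean"])
print(clusters)
```

Each resulting group could then be summarized by a single score (for example, the mean of its standardized members) before modeling. That estimates a combined effect for the group, so it trades away the separate per-variable impacts the question asks about.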