I have 200 continuous variables, all positively correlated with the dependent variable. My problem is that I can't exclude any of them from my model, as I need to attribute a weight to each and every one. They are also correlated with one another. RF is great for prediction but does not provide weights, linear regression has the issue of multicollinearity, and ridge regression doesn't help much. Any suggestions on how best to tackle this problem?
If you're not concerned with interpretability, you can try Principal Components Analysis to extract a low-dimensional subspace with maximal variation, but I'm surprised ridge regression didn't help with the multicollinearity. Perhaps try LASSO instead?
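To illustrate the LASSO suggestion, here is a minimal sketch using scikit-learn on synthetic data shaped like the question (200 correlated positive predictors); the data-generating setup and the `alpha` value are assumptions for illustration only, and in practice you would tune `alpha` by cross-validation (e.g. with `LassoCV`):

```python
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n, p = 500, 200

# Simulate correlated predictors: a shared latent factor plus independent noise,
# so all 200 columns are positively correlated with each other and with y.
latent = rng.normal(size=(n, 1))
X = latent + 0.5 * rng.normal(size=(n, p))
y = X @ rng.uniform(0.1, 1.0, size=p) + rng.normal(size=n)

# Standardize so the L1 penalty treats all predictors on the same scale.
X_std = StandardScaler().fit_transform(X)

model = Lasso(alpha=0.1).fit(X_std, y)
weights = model.coef_  # one weight per predictor; LASSO may shrink some to exactly zero
```

Note the trade-off for this use case: LASSO's sparsity means some of the 200 weights can be exactly zero, whereas ridge keeps all coefficients nonzero, so if every variable must receive a nonzero weight, an elastic net (mixing L1 and L2 penalties) may be a better fit.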
See also answers here:
Dealing with correlated regressors
https://stats.stackexchange.com/a/38095/135759
How to deal with multicollinearity when performing variable selection?

Maximilian Aigner
- Seems more a comment than an answer. Can you elaborate a little bit further? We are looking for high-quality answers. – utobi Nov 28 '16 at 14:12
- Thank you for answering, but I am indeed concerned with interpretability. – user4797853 Nov 28 '16 at 14:18
- @utobi I would, but low rep doesn't allow me to post comments. – Maximilian Aigner Nov 28 '16 at 14:18
- To comment, you need to wait until you've earned enough reputation points, which you gain by posting proper answers. – utobi Nov 28 '16 at 14:22