
I have 200 continuous variables, all positively correlated with the dependent variable. My problem is that I can't exclude any of them from my model, as I need to attribute a weight to each and every one. They are also correlated among themselves. Random forests are great for prediction but do not provide weights, linear regression has the issue of multicollinearity, and ridge doesn't help much. Any suggestion on how best to tackle this problem?

user4797853

1 Answer


If you're not concerned with interpretability, you can try Principal Component Analysis (PCA) to extract a low-dimensional subspace that captures maximal variation. I'm surprised ridge regression didn't help with the multicollinearity, though; perhaps try the LASSO instead? A rough sketch of both ideas follows.
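Here is a minimal sketch in Python with scikit-learn, using synthetic data as a stand-in for your 200 correlated predictors; the sample size, penalty grid, and variance threshold are placeholders, not recommendations:

    import numpy as np
    from sklearn.datasets import make_regression
    from sklearn.decomposition import PCA
    from sklearn.linear_model import LassoCV
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler

    # Synthetic stand-in: 200 correlated continuous predictors.
    X, y = make_regression(n_samples=500, n_features=200,
                           effective_rank=20, noise=5.0, random_state=0)

    # Option 1: LASSO with a cross-validated penalty.
    # coef_ gives one weight per variable (some may be shrunk to exactly zero).
    lasso = make_pipeline(StandardScaler(), LassoCV(cv=5, random_state=0))
    lasso.fit(X, y)
    lasso_weights = lasso.named_steps["lassocv"].coef_

    # Option 2: principal component regression.
    # Regress on a low-dimensional subspace capturing ~95% of the variance,
    # then back-project the component coefficients onto the original variables
    # if you still need a weight per predictor.
    scaler = StandardScaler()
    pca = PCA(n_components=0.95)
    Z = pca.fit_transform(scaler.fit_transform(X))
    pcr = LassoCV(cv=5, random_state=0).fit(Z, y)
    pcr_weights = pca.components_.T @ pcr.coef_

Note that the back-projected PCR weights are no longer sparse; they just express the fitted low-dimensional model in the original variable space.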

See also answers here:

Dealing with correlated regressors

https://stats.stackexchange.com/a/38095/135759

How to deal with multicollinearity when performing variable selection?