Thus far, I have removed collinear variables during data preparation by inspecting the correlation table and dropping variables whose pairwise correlation exceeds a certain threshold. Is there a more accepted way of doing this? I am also aware that looking at the correlation between only two variables at a time is not ideal; measures like the variance inflation factor (VIF) take into account potential correlation across several variables. How would one go about systematically choosing a combination of variables that does not exhibit multicollinearity?
My data is in a pandas DataFrame, and I am using scikit-learn's models.
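For concreteness, here is a minimal sketch of the two ideas mentioned in the question: a pairwise correlation filter and a VIF-based check. It assumes every column is numeric and non-constant. The function names `drop_correlated`, `vif`, and `drop_high_vif`, the thresholds `0.9` and `10.0`, and the toy data are illustrative assumptions, not an established recipe. VIF is computed directly from its definition, VIF_j = 1 / (1 - R²_j), using NumPy least squares.

```python
import numpy as np
import pandas as pd


def drop_correlated(df: pd.DataFrame, threshold: float = 0.9) -> pd.DataFrame:
    """Pairwise filter: drop each column that is highly correlated with an earlier one."""
    corr = df.corr().abs()
    # Keep only the strict upper triangle so each pair is inspected once.
    upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
    to_drop = [col for col in upper.columns if (upper[col] > threshold).any()]
    return df.drop(columns=to_drop)


def vif(df: pd.DataFrame) -> pd.Series:
    """VIF of each column, from regressing it on all other columns plus an intercept."""
    X = df.to_numpy(dtype=float)
    scores = {}
    for j, name in enumerate(df.columns):
        y = X[:, j]
        others = np.delete(X, j, axis=1)
        A = np.column_stack([np.ones(len(y)), others])
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ coef
        ss_res = float(resid @ resid)
        ss_tot = float(((y - y.mean()) ** 2).sum())
        # 1 / (1 - R^2) simplifies to ss_tot / ss_res.
        scores[name] = ss_tot / ss_res if ss_res > 0 else np.inf
    return pd.Series(scores)


def drop_high_vif(df: pd.DataFrame, max_vif: float = 10.0) -> pd.DataFrame:
    """Hypothetical greedy rule: repeatedly drop the column with the largest VIF."""
    cols = list(df.columns)
    while len(cols) > 1:
        scores = vif(df[cols])
        if scores.max() <= max_vif:
            break
        cols.remove(scores.idxmax())
    return df[cols]


# Toy data: c is almost exactly a + b, yet no single pair is strongly correlated.
rng = np.random.default_rng(0)
a = rng.normal(size=500)
b = rng.normal(size=500)
toy = pd.DataFrame({"a": a, "b": b, "c": a + b + 0.01 * rng.normal(size=500)})

pairwise_kept = drop_correlated(toy, threshold=0.9)
vif_kept = drop_high_vif(toy, max_vif=10.0)
```

On this toy data the pairwise filter should keep all three columns, because each pairwise correlation is only around 0.7, while the VIF-based loop removes one of them. That is the gap between the two approaches that the question is about. The greedy loop is only one possible selection rule; which column it drops among several collinear ones is somewhat arbitrary.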