I'm using a Python module to calculate the VIF for the variables I plan to use in a binary logistic regression. I'm following this post exactly: https://etav.github.io/python/vif_factor_python.html.

With my data, I got a VIF of 1600+ for the intercept, which looks very weird to me (I have used VIF in R before but have never seen this). Is this normal, or should I do something about it? The other variables seem normal, except for one that has a slightly higher VIF.

To add more context, my response variable is highly unbalanced: it's mostly (~99%) 0 and only about 1% positive. I have a feeling this might be the cause, since the intercept column is all ones.

Any suggestions or help are welcome! Please let me know if you need more context as well.

TYZ

1 Answer

Yes and no. It depends on the scale of the outcome and on how close the grand mean is to zero (and on whether the other covariates are centered).

If you're asking whether something is algorithmically wrong with that number, then the answer is likely no. It is a simple computation with an exact analytic solution.
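To make that computation concrete, here is a minimal sketch for the simplest case of an intercept plus a single covariate. It assumes the software regresses each design-matrix column on the remaining columns and, for the constant column (where the auxiliary regression has no intercept), uses the uncentered R². The function name `intercept_vif` and the simulated covariate are hypothetical, not taken from the question's data.

```python
import random


def intercept_vif(x):
    """VIF of the constant column in a design matrix [1, x] with one covariate.

    Assumption: the ones column is regressed on x with no intercept and the
    uncentered R^2 is used, so that VIF = 1 / (1 - R^2).
    """
    n = len(x)
    sum_x = sum(x)
    sum_x2 = sum(v * v for v in x)
    slope = sum_x / sum_x2  # least-squares slope of 1 ~ slope * x (no intercept)
    ss_resid = sum((1.0 - slope * v) ** 2 for v in x)
    r2_uncentered = 1.0 - ss_resid / n  # sum of squares of a ones column is n
    return 1.0 / (1.0 - r2_uncentered)


random.seed(0)
# Hypothetical covariate whose mean (100) is far from zero relative to its spread (sd 2.5)
x = [random.gauss(100.0, 2.5) for _ in range(1000)]
mean_x = sum(x) / len(x)
x_centered = [v - mean_x for v in x]

vif_raw = intercept_vif(x)
vif_centered = intercept_vif(x_centered)
print(f"raw covariate:      intercept VIF = {vif_raw:.1f}")
print(f"centered covariate: intercept VIF = {vif_centered:.3f}")
```

Under these assumptions the intercept VIF works out to 1 + (mean / sd)² for a single covariate, so a covariate sitting about 40 standard deviations from zero is already enough to produce a value in the 1600 range, while centering the covariate brings it back to 1. The response variable never enters the calculation.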

The number is closely related to the concept of removing the intercept from the model. Removing the intercept from a model makes very little sense in most cases, as evidenced by this apparently large and meaningless number. A VIF of 1600 tells you how variable the residuals would be if you removed the grand mean from among the predictors of the outcome.

Also related: When is it ok to remove the intercept in a linear regression model?

AdamO
  • I understand that it doesn't make sense to remove the intercept from the model, since it might cause the VIF to go negative. I don't recall seeing a VIF for the intercept when I did this in R. Does this mean the VIF for the intercept is not that important? – TYZ Jan 08 '19 at 14:47
  • @YilunZhang VIF cannot be negative by definition. What software chooses to report (or not report) does not indicate whether a measure is important. Despite that, I can safely say the VIF for the intercept is not important. – AdamO Jan 08 '19 at 15:24
  • Thanks for the final clarification! I think I now feel comfortable to move forward :) – TYZ Jan 08 '19 at 15:33
  • The standard VIF is based only on the design matrix X and is independent of the response variable. – Josef Jul 08 '20 at 00:43