
I'm using the mtcars dataset in R. I used the car package to estimate the VIF, but since I have factor variables I got a table with GVIF and GVIF^(1/(2*Df)) values. In another question, Which variance inflation factor should I be using: $\text{GVIF}$ or $\text{GVIF}^{1/(2\cdot\text{df})}$?, John Fox, co-author of the paper that introduced the GVIF (https://www.tandfonline.com/doi/abs/10.1080/01621459.1992.10475190#.U2jkTFdMzTo), mentioned that they recommend using GVIF^(1/(2*Df)), but I don't know whether I should apply the usual rule of thumb of < 5 for the standard VIF or some other cutoff.

This is my code:

library(car)    # for vif()
library(dplyr)  # for %>% and mutate()

mtcars2 <- within(mtcars, {
  vs <- factor(vs, labels = c("V", "S"))
  am <- factor(am, labels = c("Automatic", "Manual"))
  cyl  <- ordered(cyl)
  gear <- ordered(gear)
  carb <- ordered(carb)
})

mtcars2$loghp <- log(mtcars2$hp)

mtcars2 <- mtcars2 %>%
  dplyr::mutate(cylnum = as.numeric(cyl)) %>%
  dplyr::mutate(cylcat = cut(cylnum, breaks = c(0, 1, 2, 3),
                             labels = c("Cyl_4", "Cyl_6", "Cyl_8")))

# keep all predictors except hp (replaced by loghp), cylnum, and cylcat
mtcars2_lm <- mtcars2[, c(1, 2, 3, 5, 6, 7, 8, 9, 10, 11, 12)]
model1 <- lm(mpg ~ ., data = mtcars2_lm)
vif(model1)
            GVIF Df GVIF^(1/(2*Df))
cyl    98.027045  2        3.146563
disp   57.217057  1        7.564196
drat    7.105793  1        2.665669
wt     23.490085  1        4.846657
qsec   10.731794  1        3.275942
vs      7.354487  1        2.711916
am      9.936800  1        3.152269
gear   50.681013  2        2.668157
carb  244.371502  5        1.733026
loghp  14.626620  1        3.824476
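
For reference, my understanding from that answer is that for a term with Df = 1 the GVIF reduces to the ordinary VIF, so GVIF^(1/(2*Df)) is on the scale of sqrt(VIF); if that is right, the usual cutoff of 5 would translate to sqrt(5) ≈ 2.24 in the last column, but that translation is exactly what I'm unsure about. This is how I would apply it (a sketch assuming the square-root translation holds):

# flag terms whose GVIF^(1/(2*Df)) exceeds sqrt(5), i.e. the VIF < 5
# rule of thumb carried over to the square-root scale
v <- vif(model1)  # matrix with columns GVIF, Df, GVIF^(1/(2*Df))
v[v[, "GVIF^(1/(2*Df))"] > sqrt(5), , drop = FALSE]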
  • Welcome to Cross Validated! Threshold for what? – Dave Jan 10 '22 at 18:01
  • For deciding which variables I should delete from my model. I have read that the rule of thumb for continuous variables is 5, so you keep every variable whose VIF is below 5. – Begdev Jan 10 '22 at 19:40
  • Why do you want to delete variables from your model? – Dave Jan 10 '22 at 19:44
  • To prevent the multicollinearity problem. – Begdev Jan 10 '22 at 19:53
  • Please read over [this thread](https://stats.stackexchange.com/q/168622/28500) for reasons why you might not need to worry about VIF much at all. High multicollinearity might lead to high-magnitude covariances among coefficient estimates, but for a predictive model that's not really a problem. – EdM Jan 10 '22 at 19:53
  • Also keep in mind that, yes, you might decrease your variance by getting stable coefficient estimates, but that could come at the expense of biasing your model to discount a predictor that does matter, perhaps with so much bias that the decrease in variance is not worth the increase in bias. – Dave Jan 10 '22 at 20:06
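
To see the covariance inflation EdM mentions, one quick check (a sketch using base R's vcov() and cov2cor() on the model above, not something from the linked thread):

# correlation matrix of the coefficient estimates of model1;
# off-diagonal entries near ±1 reflect the collinearity the GVIFs flag
round(cov2cor(vcov(model1)), 2)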

0 Answers