I'm using the mtcars dataset in R and used the car package to estimate VIFs. Because the model includes factor variables, vif() returns a table with GVIF and GVIF^(1/(2*Df)) values instead of ordinary VIFs. In another question, Which variance inflation factor should I be using: $\text{GVIF}$ or $\text{GVIF}^{1/(2\cdot\text{df})}$?, John Fox, co-author of https://www.tandfonline.com/doi/abs/10.1080/01621459.1992.10475190#.U2jkTFdMzTo, mentioned that they recommend using GVIF^(1/(2*Df)). What I don't know is whether the usual rule of thumb for the standard VIF (VIF < 5) also applies to GVIF^(1/(2*Df)), or whether I should use a different cutoff.
This is my code:
library(car)    # vif()
library(dplyr)  # %>% and mutate()

# Recode the categorical columns as (ordered) factors
mtcars2 <- within(mtcars, {
  vs   <- factor(vs, labels = c("V", "S"))
  am   <- factor(am, labels = c("Automatic", "Manual"))
  cyl  <- ordered(cyl)
  gear <- ordered(gear)
  carb <- ordered(carb)
})
mtcars2$loghp <- log(mtcars2$hp)
mtcars2 <- mtcars2 %>%
  dplyr::mutate(cylnum = as.numeric(cyl)) %>%
  dplyr::mutate(cylcat = cut(cylnum, breaks = c(0, 1, 2, 3),
                             labels = c("Cyl_4", "Cyl_6", "Cyl_8")))
# Keep mpg and all predictors except hp (column 4), which is replaced by loghp
mtcars2_lm <- mtcars2[, c(1, 2, 3, 5, 6, 7, 8, 9, 10, 11, 12)]
model1 <- lm(mpg ~ ., data = mtcars2_lm)
vif(model1)
Variable | GVIF | Df | GVIF^(1/(2*Df)) |
---|---|---|---|
cyl | 98.027045 | 2 | 3.146563 |
disp | 57.217057 | 1 | 7.564196 |
drat | 7.105793 | 1 | 2.665669 |
wt | 23.490085 | 1 | 4.846657 |
qsec | 10.731794 | 1 | 3.275942 |
vs | 7.354487 | 1 | 2.711916 |
am | 9.936800 | 1 | 3.152269 |
gear | 50.681013 | 2 | 2.668157 |
carb | 244.371502 | 5 | 1.733026 |
loghp | 14.626620 | 1 | 3.824476 |
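
To make the relationship between the columns concrete, here is a minimal base-R sketch that rebuilds the third column from the GVIF and Df values in the table. For a term with Df = 1, GVIF^(1/(2*Df)) is just the square root of GVIF, so squaring it returns a value on the ordinary VIF scale. The names `gvif_tab`, `adj`, `adj_sq`, `vif_cutoff`, and `above_cutoff` are hypothetical, and comparing the squared value against 5 is only an assumption used for illustration, not an established rule.

```r
# GVIF and Df values copied from the vif(model1) output above
gvif_tab <- data.frame(
  term = c("cyl", "disp", "drat", "wt", "qsec",
           "vs", "am", "gear", "carb", "loghp"),
  GVIF = c(98.027045, 57.217057, 7.105793, 23.490085, 10.731794,
           7.354487, 9.936800, 50.681013, 244.371502, 14.626620),
  Df   = c(2, 1, 1, 1, 1, 1, 1, 2, 5, 1)
)

# Third column of the car output: GVIF^(1/(2*Df))
gvif_tab$adj <- gvif_tab$GVIF^(1 / (2 * gvif_tab$Df))

# Squaring gives GVIF^(1/Df), which equals GVIF itself when Df = 1
gvif_tab$adj_sq <- gvif_tab$adj^2

# ASSUMPTION: apply the VIF < 5 rule of thumb to the squared value
# (equivalently, adj < sqrt(5)); this cutoff is illustrative only
vif_cutoff <- 5
gvif_tab$above_cutoff <- gvif_tab$adj_sq > vif_cutoff

print(gvif_tab, digits = 4)
```

Under that assumption, the comparison is made on the same scale for single-coefficient terms and multi-level factors, which is the reason for looking at GVIF^(1/(2*Df)) rather than raw GVIF.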