1

I was using a fixed-effects panel model with interaction effects when I realized that the VIF values were too high for some variables. I was advised to standardize the predictor variables to mitigate multicollinearity. My question is: can I standardize just one predictor variable, or must I standardize all of them?

Either way, are there any academic sources that I could refer to on this matter?

Helix123
user130252

2 Answers

2

Standardizing just slides the variables up or down (to make their means equal to $0$), and squeezes or stretches their scales, to make the resulting SDs equal to $1$. It doesn't change the relationship between variables. However, multicollinearity is about the relationship between the variables. As a result, standardizing has no effect on multicollinearity.
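To make this concrete, here is a minimal sketch on synthetic data (the variable names, seed, and numbers are illustrative assumptions, not taken from the question). It checks that the correlation between two collinear predictors, and hence the two-predictor VIF $1/(1-r^2)$, is the same before and after standardizing.

```python
import random
import statistics

def pearson(a, b):
    """Sample Pearson correlation between two equal-length lists."""
    ma, mb = statistics.fmean(a), statistics.fmean(b)
    cov = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    ss_a = sum((p - ma) ** 2 for p in a)
    ss_b = sum((q - mb) ** 2 for q in b)
    return cov / (ss_a * ss_b) ** 0.5

def standardize(a):
    """Shift to mean 0 and rescale to sample SD 1."""
    m, s = statistics.fmean(a), statistics.stdev(a)
    return [(p - m) / s for p in a]

rng = random.Random(0)  # fixed seed, chosen arbitrarily
x1 = [rng.gauss(50, 10) for _ in range(2000)]
# x2 is constructed to be strongly collinear with x1 (synthetic data)
x2 = [0.9 * v + rng.gauss(0, 3) for v in x1]

r_raw = pearson(x1, x2)
r_std = pearson(standardize(x1), standardize(x2))

# With exactly two predictors, each VIF equals 1 / (1 - r^2)
vif_raw = 1 / (1 - r_raw ** 2)
vif_std = 1 / (1 - r_std ** 2)

print(f"r raw = {r_raw:.4f}, r standardized = {r_std:.4f}")
print(f"VIF raw = {vif_raw:.2f}, VIF standardized = {vif_std:.2f}")
```

The two correlations agree up to floating-point error, so the VIFs do as well.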

If your multicollinearity comes from creating product terms or interaction terms, standardizing can help, provided you standardize before you create the new terms. That's the only case, though. It's also possible to get collinearity with the intercept, but that doesn't matter; you can ignore it, if that's the diagnostic you're worried about.
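The order of operations can be illustrated with a small simulation. This is only a sketch under assumed conditions: two independent, synthetic predictors with means far from zero, and hypothetical names `x` and `z`. It compares the correlation between a predictor and the product term when the product is built from raw variables, when it is built from standardized variables, and when standardizing is applied only after the product has been formed.

```python
import random
import statistics

def pearson(a, b):
    """Sample Pearson correlation between two equal-length lists."""
    ma, mb = statistics.fmean(a), statistics.fmean(b)
    cov = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    ss_a = sum((p - ma) ** 2 for p in a)
    ss_b = sum((q - mb) ** 2 for q in b)
    return cov / (ss_a * ss_b) ** 0.5

def standardize(a):
    """Shift to mean 0 and rescale to sample SD 1."""
    m, s = statistics.fmean(a), statistics.stdev(a)
    return [(p - m) / s for p in a]

rng = random.Random(1)  # fixed seed, chosen arbitrarily
n = 5000
x = [rng.gauss(50, 10) for _ in range(n)]
z = [rng.gauss(30, 5) for _ in range(n)]  # drawn independently of x

# Case 1: product term built from the raw variables
xz_raw = [p * q for p, q in zip(x, z)]
r_raw = pearson(x, xz_raw)

# Case 2: standardize first, then build the product term
xs, zs = standardize(x), standardize(z)
xz_std = [p * q for p, q in zip(xs, zs)]
r_std_first = pearson(xs, xz_std)

# Case 3: build the product from raw variables, standardize afterwards
r_std_after = pearson(standardize(x), standardize(xz_raw))

print(f"raw product:            r = {r_raw:.3f}")
print(f"standardize, then mult: r = {r_std_first:.3f}")
print(f"mult, then standardize: r = {r_std_after:.3f}")
```

In this setup the first and third correlations are identical and large, while the second is close to zero, which is the sense in which standardizing only helps if it happens before the product is formed.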

In short, standardizing is a red herring here. It may help you to read this CV thread: When conducting multiple regression, when should you center your predictor variables & when should you standardize them?

gung - Reinstate Monica
0

In this excellent answer, the author derives the covariance between $X$ and the interaction term $XY$. He shows that when $X$ and $Y$ are independent, $\operatorname{Cov}(X, XY) = \sigma_x^2 \mu_y$. Standardizing $Y$ alone makes $\mu_y = 0$ and thus reduces this covariance to zero. I don't know how to evaluate the case where $\operatorname{Cov}(X, Y) \neq 0$. At the bottom, though, the author gives some intuition: $XY$ will be large when $X$ is large, so the two will be correlated. This suggests that we would like both $X$ and $Y$ to be standardized, because either variable could induce a correlation between itself and $XY$.
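For reference, the independence case can be worked out in a few lines. This is my own reconstruction under the assumption that $X$ and $Y$ are independent with finite second moments, not a quotation of the linked answer:

$$
\begin{aligned}
\operatorname{Cov}(X, XY) &= E[X^2 Y] - E[X]\,E[XY] \\
&= E[X^2]\,E[Y] - E[X]^2\,E[Y] \\
&= \left(E[X^2] - \mu_x^2\right)\mu_y = \sigma_x^2\,\mu_y .
\end{aligned}
$$

By the same argument with the roles swapped, $\operatorname{Cov}(Y, XY) = \sigma_y^2\,\mu_x$. So centering $Y$ alone sets the first covariance to zero but leaves the second at $\sigma_y^2\,\mu_x$, which vanishes only if $X$ is centered as well.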

confused student