Suppose I have a model such as

$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_k X_k + \epsilon$.

Now, there's a high correlation between $X_1$ and $X_2$ (say over 60% but below 75%). Does that mean this model has a multicollinearity problem? Is there any relationship between highly correlated variables and multicollinearity? Is there any short literature on this topic?

Nick Cox
Beta
  • Several relevant threads here e.g. http://stats.stackexchange.com/questions/1149/is-there-an-intuitive-explanation-why-multicollinearity-is-a-problem-in-linear-r – Nick Cox Dec 19 '13 at 17:48
  • and this one is relevant too http://stats.stackexchange.com/q/70899/3277 – ttnphns Dec 19 '13 at 19:27
  • Also http://stats.stackexchange.com/questions/38093/how-to-deal-with-high-correlation-among-predictors-in-multiple-regression – Peter Flom Dec 19 '13 at 19:42
  • Thank you everyone for your responses. I actually looked for my question in the forum before asking this one. But I didn't get it earlier. So, had to ask this question. Thank you again. – Beta Dec 20 '13 at 07:07

2 Answers

The variance inflation factor (VIF) quantifies the severity of multicollinearity in an ordinary least squares regression analysis:

$$ VIF=\frac{1}{1-r^2} $$

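where $r$ is the correlation between two independent variables such as $X_1$ and $X_2$. (More generally, the VIF for a predictor $X_j$ uses $R_j^2$, the coefficient of determination from regressing $X_j$ on all the other predictors; with only two predictors this equals the squared pairwise correlation.) A common rule of thumb is to say there is collinearity if $VIF \geq 10$. In your case, even at the upper end of your range, $VIF=\frac{1}{1-0.75^2}\approx 2.29$, so this pairwise correlation alone does not indicate a collinearity problem between $X_1$ and $X_2$. If you use R for modeling, the VIF can be checked easily with `vif(fit)` from the car package.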
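To make the arithmetic concrete, here is a minimal sketch of the two-variable formula above in Python. The function name `vif_from_correlation` and the chosen correlation values are illustrative assumptions, and the threshold of 10 is only the rule of thumb mentioned above.

```python
import math

def vif_from_correlation(r):
    """VIF implied by a pairwise correlation r (two-predictor case only)."""
    return 1.0 / (1.0 - r ** 2)

# Illustrative correlations, including the 0.60 to 0.75 range from the question.
for r in (0.60, 0.75, 0.90, 0.95):
    print(f"r = {r:.2f}  VIF = {vif_from_correlation(r):.2f}")

# Correlation at which the two-variable VIF reaches the rule-of-thumb value of 10.
r_threshold = math.sqrt(1.0 - 1.0 / 10.0)
print(f"VIF reaches 10 at |r| = {r_threshold:.3f}")
```

Under this formula a pairwise correlation has to be roughly 0.95 in absolute value before the VIF reaches 10, which is why correlations between 0.60 and 0.75 fall well short of it.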

David Z

Correlation is neither necessary nor sufficient for collinearity problems, although perfect correlation will cause problems. The best way to test for collinearity is with condition indices.

See my answer
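As a rough illustration of why pairwise correlation is not necessary for collinearity, the sketch below builds a simulated predictor that is almost exactly the sum of three independent ones, then computes condition indices. Everything here is an assumption made for the example: the simulated data, the helper `condition_indices`, the scaling of each column to unit length with the intercept included, and the commonly quoted guideline that indices above about 30 indicate serious collinearity.

```python
import numpy as np

def condition_indices(X):
    """Condition indices of a design matrix after scaling columns to unit length."""
    X = np.asarray(X, dtype=float)
    scaled = X / np.linalg.norm(X, axis=0)
    singular_values = np.linalg.svd(scaled, compute_uv=False)
    # Each index is the largest singular value divided by one singular value.
    return singular_values.max() / singular_values

rng = np.random.default_rng(0)
n = 500
x1, x2, x3 = rng.normal(size=(3, n))
# x4 is nearly an exact linear combination of x1, x2 and x3.
x4 = x1 + x2 + x3 + rng.normal(scale=0.01, size=n)

predictors = np.column_stack([x1, x2, x3, x4])
corr = np.corrcoef(predictors, rowvar=False)
max_abs_corr = np.abs(corr[np.triu_indices(4, k=1)]).max()

# Include the intercept column in the design matrix.
design = np.column_stack([np.ones(n), predictors])
indices = condition_indices(design)

print(f"largest pairwise |correlation|: {max_abs_corr:.2f}")
print(f"largest condition index: {indices.max():.1f}")
```

In this construction no pair of predictors is strongly correlated, yet the largest condition index is far above 30, because the near-dependency involves four variables at once.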

Peter Flom
  • Peter, when you can just link to another answer, the question almost certainly is a duplicate and ought to be closed as such. – whuber Dec 19 '13 at 19:29