
I have three independent variables:

  • $X_1$ (my 'main' independent variable),
  • $X_2$
  • $X_3$ where

When I regress the response variable $y$ on $X_1$ alone in a bivariate regression, the result is not statistically significant.

However, when I include $X_2$ and $X_3$ in a multivariate regression, all three predictors become significant. After further examination I realized that $X_2$ and $X_3$ are highly correlated with each other, but neither of them is correlated with $X_1$.

Can the multicollinearity of $X_2$ and $X_3$ have affected $X_1$, or how am I supposed to interpret these results?

I am looking for an explanation in layman's terms.

Ferdi
F.Knutas
  • What has probably happened is that $X_2$ and $X_3$ have predicted the bit of $Y$ which was not related to $X_1$. Now, when you put $X_1$ in, it correlates with the remainder of $Y$; this was previously masked by the other chunk, which has now been removed. – mdewey Dec 31 '16 at 16:51
  • @mdewey okay, thanks! But how should I describe the results of this model? Could I say that $X_1$ together with $X_2$ and $X_3$ can predict some variation in $y$, or is this too far-fetched? (The adjusted $R^2$ value is .123) – F.Knutas Dec 31 '16 at 17:18
  • Don't the answers at http://stats.stackexchange.com/questions/28474 clear this up? If not, then please tell us where you would like additional explanation or how your question is really different from that one. Also, it appears you are using "multivariate" where you mean "multiple": do you actually have more than one response variable in your model? – whuber Dec 31 '16 at 18:26
  • @whuber, thank you, that was helpful! And no, sorry, I do not have more than one response, I mean multiple. However, I still do not really understand how to describe these results more theoretically. If the added variables might have absorbed some of the residual variability and increased the significance of the main IV, what does this mean for my model as a whole? Can I still conclude that X1 together with X2 and X3 can predict some of the variation in my response variable? – F.Knutas Dec 31 '16 at 19:43
  • It depends on how hard you were looking for a model. If your original aim was to include all three $X_i$, then you have three significant predictors and you should include them. If instead you were fishing around to find some combination that might look significant (which is fine to do, but suggests an *exploratory* rather than a *confirmatory* analysis), then you ought (at least) to adjust your p-value to account for the number of possible models you might have looked at. Many answers (and comments) by [Frank Harrell](http://stats.stackexchange.com/users/4253/frank-harrell) address this issue. – whuber Dec 31 '16 at 19:47
  • My original aim was to control for other possible explanations for my response, but I didn't expect them to make the original IV significant. But I guess that the variance in my response can then only be explained when all three are included? Thank you so much for the layman's explanation! You may have saved my disastrous attempt at finishing this thesis! Happy new year! – F.Knutas Dec 31 '16 at 20:02
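
The masking effect mdewey describes in the comments can be reproduced with made-up data. The sketch below is only an illustration, not the asker's data or model: the sample size, the coefficients, the noise levels, and the helper `ols` are all assumptions chosen for the demonstration. It simulates an $X_1$ with a small effect, plus an $X_2$ and $X_3$ that are highly correlated with each other but unrelated to $X_1$, and then compares the standard error and t-statistic of $X_1$ with and without the other two predictors.

```python
import math
import random


def ols(columns, y):
    """Ordinary least squares with an intercept, via the normal equations.

    Hypothetical helper for this illustration only.
    Returns (coefficients, standard_errors); index 0 is the intercept.
    """
    n = len(y)
    rows = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(rows[0])
    # Build X'X and X'y.
    xtx = [[sum(r[a] * r[b] for r in rows) for b in range(p)] for a in range(p)]
    xty = [sum(r[a] * yi for r, yi in zip(rows, y)) for a in range(p)]
    # Invert X'X by Gauss-Jordan elimination on the augmented matrix [X'X | I].
    aug = [xtx[a] + [1.0 if a == b else 0.0 for b in range(p)] for a in range(p)]
    for c in range(p):
        pivot = max(range(c, p), key=lambda k: abs(aug[k][c]))
        aug[c], aug[pivot] = aug[pivot], aug[c]
        scale = aug[c][c]
        aug[c] = [v / scale for v in aug[c]]
        for k in range(p):
            if k != c:
                factor = aug[k][c]
                aug[k] = [v - factor * w for v, w in zip(aug[k], aug[c])]
    inv = [row[p:] for row in aug]
    beta = [sum(inv[a][b] * xty[b] for b in range(p)) for a in range(p)]
    # Residual variance uses n - p degrees of freedom.
    resid = [yi - sum(bj * v for bj, v in zip(beta, r)) for r, yi in zip(rows, y)]
    sigma2 = sum(e * e for e in resid) / (n - p)
    se = [math.sqrt(sigma2 * inv[a][a]) for a in range(p)]
    return beta, se


# Simulated data (all values below are assumptions for the illustration).
rng = random.Random(0)
n = 500
x1 = [rng.gauss(0, 1) for _ in range(n)]          # "main" predictor, small effect
x2 = [rng.gauss(0, 1) for _ in range(n)]          # independent of x1
x3 = [v + 0.3 * rng.gauss(0, 1) for v in x2]      # highly correlated with x2
y = [0.3 * a + 2 * b + 2 * c + rng.gauss(0, 1) for a, b, c in zip(x1, x2, x3)]

# Model 1: y on x1 alone. Model 2: y on x1, x2 and x3.
beta_simple, se_simple = ols([x1], y)
beta_full, se_full = ols([x1, x2, x3], y)

t_simple = beta_simple[1] / se_simple[1]
t_full = beta_full[1] / se_full[1]

print(f"x1 alone:      b = {beta_simple[1]:.3f}, se = {se_simple[1]:.3f}, t = {t_simple:.2f}")
print(f"x1 with x2,x3: b = {beta_full[1]:.3f}, se = {se_full[1]:.3f}, t = {t_full:.2f}")
```

In a setup like this, the coefficient of $X_1$ is estimated on roughly the same scale in both models, but its standard error shrinks once $X_2$ and $X_3$ soak up the part of $y$ that $X_1$ cannot explain. The collinearity between $X_2$ and $X_3$ inflates their own standard errors, not that of $X_1$, because $X_1$ is unrelated to both.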
