Lurking variables probably have something to do with this. I'm just trying to figure out how their difference can affect a linear model.
1 Answer
The (potentially causal) interpretation of your model doesn't come from the model itself. It comes from the design / setup of your study. Causality can primarily be inferred from your model when you have run a true experiment. There are various methods for attempting to infer causality with observational data (e.g., instrumental variables, difference-in-differences, propensity scores, etc.), but they all require additional assumptions and are generally not as strong as experiments. If you don't have a true experiment, it is safest to assume your model estimates a marginal association only.
Omitted / lurking variables affect this when they are correlated with $X$ variables in your model and with your response ($Y$) variable. In that case, they bias your estimates such that a variable could appear causal when it actually isn't. To understand this better, it may help to read my answer here: Estimating $b_1x_1+b_2x_2$ instead of $b_1x_1+b_2x_2+b_3x_3$.
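As a concrete illustration, here is a minimal simulated sketch in Python/numpy (not from the linked answer; the data-generating process and coefficients are made up for demonstration). A confounder $x_3$ drives both $x_2$ and $Y$: fitting the full model recovers the true coefficient on $x_2$, while omitting $x_3$ biases it upward.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Hypothetical data-generating process: x3 drives both x2 and y.
x3 = rng.normal(size=n)
x2 = 0.8 * x3 + rng.normal(size=n)              # x2 is correlated with x3
y = 1.0 * x2 + 2.0 * x3 + rng.normal(size=n)    # true effect of x2 on y is 1.0

def ols(X, y):
    """Ordinary least squares with an intercept, via least squares."""
    X = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

# Full model: the coefficient on x2 is close to the true value 1.0.
print(ols(np.column_stack([x2, x3]), y))

# Omitting the confounder x3: the coefficient on x2 is biased upward,
# roughly 1.0 + 2.0 * Cov(x2, x3) / Var(x2) ≈ 1.98.
print(ols(x2.reshape(-1, 1), y))
```

The size of the bias depends on how strongly the omitted variable is related to both the included predictor and the response; if either correlation is zero, the bias vanishes.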

- I'm a little confused by the link you posted. Are you saying that correlation affects the error? So if I have a variable that is correlated then I have a bias in the model? Sorry, I'm a bit confused... – LSerrano113 Dec 09 '14 at 21:39
- @Carla, yes, that's the idea. If there is a variable x3 that is correlated with both another x variable (x2) and with y, then there can be problems. E.g., x2 can be causally unrelated to Y, but will look like it causes Y. – gung - Reinstate Monica Dec 09 '14 at 21:44
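To see the scenario in that comment numerically, here is a minimal sketch (a hypothetical simulation, under the same made-up assumptions as above) where x2 has no causal effect on y at all, yet regressing y on x2 alone produces a clearly nonzero slope because both depend on x3.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 10_000

x3 = rng.normal(size=n)
x2 = 0.8 * x3 + rng.normal(size=n)   # x2 is correlated with x3 but has no effect on y
y = 2.0 * x3 + rng.normal(size=n)    # only x3 causes y

# Simple regression of y on x2 alone (with an intercept).
X = np.column_stack([np.ones(n), x2])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
print(beta)  # slope ≈ 2.0 * 0.8 / 1.64 ≈ 0.98, even though x2 does not cause y
```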