
As stated in the econometrics textbook (Introductory Econometrics by Wooldridge):

When $E(u|x)=E(u)$ holds, we say that $u$ is mean independent of $x$.

Why can't we simply say that $u$ is independent of $x$? Is it possible to have $u$ completely independent of $x$?

JoZ
  • Answers to this previously asked question might help? http://stats.stackexchange.com/questions/205894/how-is-mean-independence-defined – Dragonfly Apr 05 '17 at 12:55

2 Answers


The condition $E(u|x)=E(u)$ is not the same thing as independence in general.

It's implied by independence (when those expectations exist) but it can be true when you don't have independence.

Consider, for example, the case where some other aspect of the distribution changes with $x$ without changing the mean -- then you'd have dependence but mean-independence. (One example would be where the conditional variance was not constant.)
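
A minimal simulation sketch of this case (the construction $u = xz$ is my illustrative choice, not from the answer): $z$ is standard normal and independent of $x$, so $E(u|x)=0$ at every level of $x$ (mean independence), while $Var(u|x)=x^2$ changes with $x$, so $u$ and $x$ are not independent.

```python
import numpy as np

rng = np.random.default_rng(0)

# x takes a few discrete levels; z is standard normal, independent of x
x = rng.choice([1.0, 2.0, 3.0], size=100_000)
z = rng.standard_normal(100_000)

# u = x * z: E(u | x) = 0 for every x (mean independence),
# but Var(u | x) = x^2 depends on x, so u and x are NOT independent
u = x * z

for level in [1.0, 2.0, 3.0]:
    mask = x == level
    print(f"x={level}: mean(u)={u[mask].mean():+.3f}, var(u)={u[mask].var():.3f}")
# Conditional means are all ~0; conditional variances differ (~1, ~4, ~9)
```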

[If you're asking why the author invoked mean-independence rather than independence, it's hard to say much without context; presumably only the weaker condition was necessary for whatever was being done.]

Glen_b

As the direction of the arrows in the image below indicates, independence implies mean independence, which in turn implies zero correlation. The converse statements are not true: zero correlation does not imply mean independence, which in turn doesn't imply independence.

[Diagram: independence ⇒ mean independence ⇒ zero correlation, with one-way arrows]
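
To make one of the failed converses concrete, here is a minimal simulation sketch (the construction $u = x^2 - 1$ is my illustrative choice, not from the answer): with $x$ standard normal, $u$ and $x$ are uncorrelated, yet $E(u|x) = x^2 - 1$ varies with $x$, so $u$ is not mean independent of $x$.

```python
import numpy as np

rng = np.random.default_rng(0)

# x symmetric around zero; u = x^2 - 1 has E(u) = 0
x = rng.standard_normal(200_000)
u = x**2 - 1

# Correlation is ~0 because Cov(x, u) = E(x^3) - E(x)E(x^2 - 1) = 0 ...
print("corr(x, u) ≈", np.corrcoef(x, u)[0, 1].round(3))

# ... yet E(u | x) = x^2 - 1 clearly varies with x: no mean independence
for lo, hi in [(-3, -1), (-1, 1), (1, 3)]:
    mask = (x >= lo) & (x < hi)
    print(f"E(u | {lo} <= x < {hi}) ≈ {u[mask].mean():+.3f}")
```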

Intuitively, independence of $u$ and $x$ means that for each value of $x$, the conditional density function of $u$ given $x$ is identical. Mean independence is less restrictive, as it constrains only a one-number summary of the values of $u$ for each level of $x$. To be more exact, mean independence of $u$ from $x$ means that for each value of $x$, a one-number summary of the values of $u$, namely the average weighted by the conditional density function of $u$ given $x$, is the same constant. You can now see intuitively that while the conditional density function of $u$ given $x$ may differ across values of $x$, this one-number summary can still be the same across all levels of $x$. That is exactly the case in which $u$ and $x$ are not independent but are mean independent.

You ask, "Why can't we simply say that $u$ is independent of $x$? Is it possible to have $u$ completely independent of $x$?" The answers to your questions are 1) because mean independence suffices for our purposes, and 2) yes, it is possible, but we wouldn't want to assume it. Let me expand on each of these.

The context of the assumption you reference, $E(u|x)=E(u)$, is the set of assumptions for simple regression, where $u$ stands for the error term: the factors that affect our dependent variable $y$ but are not captured by $x$. Combined with the assumption $E(u)=0$, which is innocuous as long as an intercept is included in the regression, it gives $E(u|x)=0$. We care about this result because it implies that the population regression line is the conditional mean of $y$: $E(y|x)=\beta_0 + \beta_1 x$.
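
A small simulation sketch of this point (the values $\beta_0=2$, $\beta_1=3$ and the heteroskedastic error are hypothetical choices of mine): when $E(u|x)=0$, OLS recovers the population line even though $u$ is only mean independent of $x$, not fully independent.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical population: y = 2 + 3*x + u with E(u | x) = 0
# (u is heteroskedastic, so u is mean independent of x but not independent)
n = 500_000
x = rng.uniform(0, 5, n)
u = (1 + x) * rng.standard_normal(n)   # E(u|x) = 0, Var(u|x) = (1+x)^2
y = 2 + 3 * x + u

# OLS slope and intercept from the moment conditions; unbiased here
beta1 = np.cov(x, y)[0, 1] / np.var(x, ddof=1)
beta0 = y.mean() - beta1 * x.mean()
print(f"beta0 ≈ {beta0:.3f}, beta1 ≈ {beta1:.3f}")  # ~2 and ~3
```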

If we were conducting a randomized controlled experiment, we would assign the values of $x$ randomly, making $x$ and $u$ independent; independence would then imply the weaker condition of mean independence, which would in turn imply zero correlation. In a sense, in a randomized controlled experiment we get the independence of $x$ and $u$ for free. In contrast, with observational data we must make an assumption about how the unobserved factors $u$ are related in the population to the explanatory factor $x$.

We could assume that $x$ and $u$ are independent, but that is too strong an assumption. It would be equivalent to assuming that for each level of $x$, the conditional probability density function of $u$ given $x$ is exactly the same and equal to the marginal probability density function of $u$. That in turn means that every moment of the distribution of $u$ is the same for each level of $x$, and in particular the first and second conditional moments of $u$ are constant in $x$. The constancy of the second moment is precisely the assumption of homoscedasticity, which says that the variance of $u$ does not depend on the value of $x$:

[Image: the homoscedasticity assumption, $Var(u|x) = \sigma^2$]

Therefore, if we were to assume that $x$ and $u$ were independent, we would be assuming homoscedasticity, and a lot more (think of the moments higher than the second). However, homoscedasticity is not as critical an assumption: it has no bearing on the unbiasedness of the regression coefficients, so we want to keep it as a separate assumption that we may or may not choose to make. We wouldn't want this "nice to have" assumption mixed in with the most vital assumption of OLS, $E(u|x)=E(u)$.
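
Spelling out the chain of implications in the preceding paragraphs (a standard argument, sketched here, not from the original answer):

```latex
x \perp\!\!\!\perp u
  \;\Rightarrow\; f_{u \mid x}(u \mid x) = f_u(u)
  \;\Rightarrow\; E\left(u^k \mid x\right) = E\left(u^k\right) \text{ for all } k
  \;\Rightarrow\; \operatorname{Var}(u \mid x)
      = E\left(u^2 \mid x\right) - \left[E(u \mid x)\right]^2
      = \operatorname{Var}(u).
```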

At the other extreme, we could assume only that $x$ and $u$ are uncorrelated. In fact, that suffices for deriving the OLS estimates. However, it would not be sufficient to conclude that the population regression is given by the conditional mean $E(y|x)=\beta_0 + \beta_1 x$.
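
To see why uncorrelatedness suffices to derive the OLS estimands, apply $E(u)=0$ and $\operatorname{Cov}(x,u)=0$ to $y=\beta_0+\beta_1 x+u$ (a standard derivation, sketched here):

```latex
\operatorname{Cov}(x, y)
  = \operatorname{Cov}(x, \beta_0 + \beta_1 x + u)
  = \beta_1 \operatorname{Var}(x) + \operatorname{Cov}(x, u)
  = \beta_1 \operatorname{Var}(x)
\;\Rightarrow\;
\beta_1 = \frac{\operatorname{Cov}(x, y)}{\operatorname{Var}(x)},
\qquad
\beta_0 = E(y) - \beta_1\, E(x).
```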

Good lecture notes that I used for inspiration: http://statweb.stanford.edu/~adembo/math-136/Orthogonality_note.pdf

ColorStatistics
  • Color, mean independence *is not* an assumption for linear regression. *Regression errors are always mean independent of $X$*. See https://stats.stackexchange.com/questions/321180/is-the-linearity-assumption-in-linear-regression-merely-a-definition-of-epsilo/321243#321243 – Carlos Cinelli Dec 12 '18 at 05:39
  • Mean independence is invoked for identification of structural parameters. – Carlos Cinelli Dec 12 '18 at 05:40
  • In the diagram, under "U and X are mean independent", it should read $E(u|X)=E(u)$. – user244731 Apr 14 '19 at 16:29
  • @CarlosCinelli: can you clarify what you mean by "Regression errors are always mean independent of X"? In what context: prediction, causal analysis in an experimental setting, or causal analysis using observational data? Or perhaps all of them? Why is that the case? The link you quoted didn't clear things up for me. Thank you. – ColorStatistics Jul 19 '20 at 15:14
  • By construction regression errors are mean independent of the regressors. This holds in any case, it’s an algebraic fact. – Carlos Cinelli Jul 21 '20 at 15:21
  • Hmm... I see Wooldridge saying on page 25 of Introductory Econometrics "The crucial assumption is that the average value of u does not depend on the value of x". Are you saying that what he calls the crucial assumption is a fact that is always true? I don't think he would call it "crucial assumption" if that was the case. – ColorStatistics Jul 21 '20 at 15:32
  • In experimental setting, I agree that u is mean independent of x, by design, but I don't see how that would be the case in an observational setting. – ColorStatistics Jul 21 '20 at 15:44
  • @CarlosCinelli: I'd like to understand your perspective. Can you refer me to a textbook where they describe mean independence as an algebraic fact? Thank you. – ColorStatistics Jul 22 '20 at 12:00
  • The error term Wooldridge is referring to is a structural (causal) error term, not a regression error term. Most econometrics books are a mess regarding this, see my previous answers. One book that gets it right though is "Mostly Harmless Econometrics" by Angrist and Pischke; see Theorem 3.1.1, page 32. – Carlos Cinelli Jul 22 '20 at 15:29
  • Thank you, Carlos. I will look it up in "Mostly Harmless Econometrics" to understand this better. – ColorStatistics Jul 22 '20 at 15:39