In the OLS model, we assume that $E(X'U)=0$ (with $U$ being the error term), which comes from $E(U|X=x)=0$; the latter gives us $E(U)=0$ and $\operatorname{cov}(x_i, U)=0$ for all $x_i$. I understand this argument intuitively: $E(U|X=x)=0$ would be violated if $X$ and $U$ were correlated, e.g., if $U$ increased with $X$ we would have $E(U|X=x)>0$ for large $x$. But I also know that this is not sufficient for a formal proof, and one is not presented in Wooldridge (2010).

I am also wondering whether the implication goes the other way as well, i.e., does $E(X'U)=0 \Rightarrow E(U|X=x)=0$, given that $E(U)=0$?

I'm guessing both are fairly straightforward (which is presumably why they were omitted from the text), but an explanation or a hint in the right direction would be appreciated.

1 Answer


No, $E[UX] = 0$ and $E[U] = 0$ do not imply $E[U|X = x] = 0, ~\forall x$. Here is a simple counter-example.

Imagine $X$ is standard Gaussian, and that $E[U|X = x] = x^2 - 1$. Note that $E[U|X = x] \neq 0$ for all $x \neq \pm 1$.

But,

$$ E[UX] = E[E[UX|X]] = E[E[U|X]X] = E[(X^2 - 1)X] = E[X^3 - X] = 0, $$

since the first and third moments of a standard Gaussian are both zero.

Also note $E[U] = E[E[U|X]] = E[X^2 - 1] = 1 - 1 = 0$.

Thus, we have $E[UX] = 0$, $E[U] = 0$, with $E[U|X = x] = x^2 -1 \neq 0$ in general.
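If it helps to see this numerically, here is a quick Monte Carlo sketch of the counterexample (my own addition, not part of the argument above; it assumes $U = X^2 - 1 + \varepsilon$ with $\varepsilon$ independent standard normal noise, which gives exactly $E[U|X=x] = x^2 - 1$):

```python
# Monte Carlo check of the counterexample: E[U] = 0 and E[UX] = 0,
# yet E[U | X = x] = x^2 - 1 is clearly nonzero away from x = +-1.
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000

x = rng.standard_normal(n)             # X ~ N(0, 1)
u = x**2 - 1 + rng.standard_normal(n)  # so E[U | X = x] = x^2 - 1

print(np.mean(u))      # ~0: E[U] = 0 holds
print(np.mean(u * x))  # ~0: E[UX] = 0 holds

# But the conditional mean is far from zero in the tails:
# among draws with |X| > 2 we expect E[U | X] >= 3.
print(np.mean(u[np.abs(x) > 2]))  # clearly positive
```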

PS: In the OLS model, you do not need to assume $E[U|X] = 0$. The population linear regression is simply the best linear approximation to the conditional expectation function $E[Y|X]$ (in the sense of minimizing mean squared error); exogeneity is invoked for the identification of structural parameters (see here).
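As for the forward direction the question mentions, it does follow in one line from the law of iterated expectations (a standard argument, sketched here for completeness):

$$ E[X'U] = E\big[\,E[X'U \mid X]\,\big] = E\big[X'\,E[U \mid X]\,\big] = E[X' \cdot 0] = 0. $$

$E[U] = E[E[U|X]] = 0$ follows the same way, and $\operatorname{cov}(x_i, U) = E[x_i U] - E[x_i]E[U] = 0$ is then immediate.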

Carlos Cinelli