What does it mean to talk about the expectation of the product of the error term and an independent variable? Like, why do we even need to mention $E(e_i X_{ik})$? What is it actually describing or what is the intuition behind it?

So for a linear regression model

$Y_i = \beta_1 + \beta_2 X_{i2} + e_i$

When people talk about strict exogeneity, $E(e_i \mid X_{i2}) = 0$, they often mention something like $E(e_i X_{i2}) = 0$ or $E(e_i Y_i) = 0$, and I just don't understand why. How do they jump from a conditional expectation to the expectation of a product, and why?

This question confused me for months. Thank you very much for your help in advance!

T. G.

1 Answer

Let me write the model as $y=\beta_0+\beta_1x_1+\dots+\beta_px_p+e$, or $y=\mathbf{x}^T\boldsymbol{\beta}+e$, with $\mathbf{x}=(1,x_1,\dots,x_p)^T$ and $\boldsymbol{\beta}=(\beta_0,\dots,\beta_p)^T$.

$E[e\mathbf{x}]=0$ follows from $E[e\mid \mathbf{x}]=0$. In general, if $E[e\mid \mathbf{x}]=0$, then

  • $E[e]=0$ by the law of total expectation: $$E[e]=E[E[e\mid \mathbf{x}]]=E[0]=0$$
  • $E[f(\mathbf{x})e]=0$, where $f(\mathbf{x})$ is an arbitrary finite valued function, by the same law: $E[f(\mathbf{x})e]=E[E[f(\mathbf{x})e\mid\mathbf{x}]]$, but when $\mathbf{x}$ is given, $f(\mathbf{x})$ is given too, so: $$E[f(\mathbf{x})e]=E[E[f(\mathbf{x})e\mid\mathbf{x}]]=E[f(\mathbf{x})E[e\mid\mathbf{x}]]=0$$
  • $E[\mathbf{x}e]=0$: let $f$ be the identity function.
  • $E[y\mid\mathbf{x}]=\mathbf{x}^T\boldsymbol{\beta}$.
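The implications in the list above can be checked numerically. The following is only a minimal sketch: the distribution of $x$, the heteroskedastic error $e=|x|\,z$, and the function name `moment_check` are assumptions made for illustration, chosen so that $E[e\mid x]=0$ holds even though $e$ and $x$ are not independent.

```python
import numpy as np

def moment_check(n=200_000, seed=0):
    """Simulate errors with E[e | x] = 0 and return sample analogues of E[f(x) e]."""
    rng = np.random.default_rng(seed)
    x = rng.normal(loc=2.0, scale=1.0, size=n)
    # e depends on x only through its variance, so E[e | x] = 0
    # even though e and x are not independent (illustrative assumption).
    e = np.abs(x) * rng.normal(size=n)
    return {
        "E[e]": e.mean(),                       # f(x) = 1
        "E[x e]": (x * e).mean(),               # f(x) = x (identity)
        "E[x^2 e]": (x**2 * e).mean(),          # f(x) = x^2
        "E[sin(x) e]": (np.sin(x) * e).mean(),  # f(x) = sin(x)
    }

moments = moment_check()
for name, value in moments.items():
    print(f"{name:12s} {value: .4f}")
```

Each printed sample average should be close to zero, up to simulation noise, whichever $f$ is used.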

If $E[e]=0$ and $E[\mathbf{x}e]=0$, then $\mathbf{x}$ and $e$ are uncorrelated: $$\text{Cov}(\mathbf{x},e)=E[\mathbf{x}e]-E[\mathbf{x}]E[e]=0$$
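One way to see why this product moment gets singled out (a standard derivation added here as a sketch, assuming $E[\mathbf{x}\mathbf{x}^T]$ is invertible): multiply the model $y=\mathbf{x}^T\boldsymbol{\beta}+e$ by $\mathbf{x}$ and take expectations. $$E[\mathbf{x}y]=E[\mathbf{x}\mathbf{x}^T]\boldsymbol{\beta}+E[\mathbf{x}e]=E[\mathbf{x}\mathbf{x}^T]\boldsymbol{\beta}\quad\Longrightarrow\quad\boldsymbol{\beta}=E[\mathbf{x}\mathbf{x}^T]^{-1}E[\mathbf{x}y]$$ The second equality uses $E[\mathbf{x}e]=0$. Replacing the expectations with sample averages gives the OLS estimator, so $E[\mathbf{x}e]=0$ is exactly the condition under which OLS targets the right $\boldsymbol{\beta}$.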

What does it mean to talk about the expectation of the product of the error term and an independent variable? Like, why do we even need to mention $E(e_i X_{ik})$? What is it actually describing or what is the intuition behind it?

In other words: what happens if $\mathbf{x}$ and $e$ are not uncorrelated?

Let's say that the 'true' model is: $$y=\beta_0+\beta_1x_1+\beta_2x_2+e$$ but your model is: $$y=\beta_0+\beta_1x_1+u$$ where $u=\beta_2x_2+e$.

If $x_1$ and $x_2$ are correlated, say $x_2=cx_1$, then:

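  • $E[u]=E[\beta_2x_2+e]=E[\beta_2cx_1+e]=\beta_2cE[x_1]\ne 0$ (unless $\beta_2cE[x_1]=0$), and, assuming $E[e\mid x_1]=0$ in the true model, $E[x_1u]=\beta_2cE[x_1^2]\ne 0$ whenever $\beta_2c\ne 0$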
  • $E[y\mid x_1]\ne \beta_0+\beta_1x_1$
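  • $\hat\beta_0$ and $\hat\beta_1$ will in general be biased and inconsistent; with $x_2=cx_1$, the slope $\hat\beta_1$ estimates $\beta_1+\beta_2c$ rather than $\beta_1$.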
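The omitted-variable example above can be simulated. This is a hedged sketch, not part of the original argument: the parameter values, the normal distributions, and the helper name `fit_short_regression` are all assumptions chosen for illustration.

```python
import numpy as np

def fit_short_regression(beta=(1.0, 2.0, 3.0), c=0.5, n=100_000, seed=0):
    """Generate y from the 'true' model, then regress y on x1 only."""
    rng = np.random.default_rng(seed)
    b0, b1, b2 = beta
    x1 = rng.normal(loc=1.0, scale=1.0, size=n)
    x2 = c * x1                      # omitted regressor, tied to x1
    e = rng.normal(size=n)           # satisfies E[e | x1, x2] = 0
    y = b0 + b1 * x1 + b2 * x2 + e   # true model

    # Short regression: intercept and x1 only, so the error is u = b2*x2 + e.
    X = np.column_stack([np.ones(n), x1])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)

    u = b2 * x2 + e
    return coef, (x1 * u).mean()     # OLS estimates and sample analogue of E[x1 u]

coef, x1u = fit_short_regression()
print("estimated slope:", coef[1])
print("sample mean of x1*u:", x1u)
```

With these assumed values the estimated slope should settle near $\beta_1+\beta_2c$ instead of $\beta_1$, and the sample mean of $x_1u$ should be clearly away from zero, which is the violated moment condition showing up in the data.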
Sergio