
I am revisiting the basic notions of linear regression and stumbled upon the following idea in Cameron and Trivedi's Microeconometrics book:

However, for the conditional mean to be linear in x, so that $E[y|x] = α+xγ$, requires the assumption that $E[u|x] = 0$, in addition to $E[u] = 0$ and $Cov[x,u] = 0$.

The way I thought about linear prediction is that it coincides with the conditional mean (the conditional expectation function, CEF) whenever the latter is actually linear; otherwise it is still the best linear predictor. What I fail to understand is how exactly $E[u|x]=0$ guarantees that the conditional mean $E[y|x]$ is linear.

What if the true relationship between $y$ and $x$ is non-linear? Then the conditional mean should be non-linear as well. How does $E[u|x]=0$ change that? Is this somehow related to the idea of omitted variables in the actual estimation (e.g. via OLS)?
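
For reference, the step in question can be written out in one line. This is only a sketch, under the assumption that $u$ is defined as the error in $y = \alpha + x\gamma + u$ (the quoted passage does not spell this out):

$$E[y|x] = E[\alpha + x\gamma + u \mid x] = \alpha + x\gamma + E[u|x].$$

Under that definition, $E[u|x]=0$ holds exactly when $E[y|x] = \alpha + x\gamma$. The weaker conditions $E[u]=0$ and $Cov[x,u]=0$ only identify $\alpha$ and $\gamma$ as the coefficients of the best linear predictor; they still allow $E[u|x]$ to be a non-zero, non-linear function of $x$.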

Mervin
  • Maybe this example helps: https://stats.stackexchange.com/questions/190703/non-linear-endogeneity/190800#190800 – Christoph Hanck Oct 26 '18 at 10:59
  • Thank you. If I understand it correctly, the confusion was that I did not see that $E[u|x]=0$ is an assumption implying linearity of the CEF. Should that assumption not hold true, we would still have (at least) the best linear predictor. Is this correct? – Mervin Oct 26 '18 at 13:03
  • Yes, OLS is a consistent estimator of the best linear projection of $y$ on $X$. Omitted variable considerations come into play once we start worrying if that linear projection also corresponds to an underlying structural model that we hope to be able to give a causal interpretation. – Christoph Hanck Oct 26 '18 at 14:51
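
The point made in the comments can be checked with a small simulation. The sketch below is illustrative only: the two data-generating processes (one with a linear CEF, one with $E[y|x]=x^2$), the sample size, and the helper names `projection_residual` and `binned_residual_means` are assumptions made for this example, not something taken from the book or the thread. In both cases the OLS residual has mean zero and zero covariance with $x$, but only in the linear case are the residual means within bins of $x$ (a rough stand-in for $E[u|x]$) close to zero.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
x = rng.normal(size=n)
e = rng.normal(size=n)  # noise, independent of x


def projection_residual(x, y):
    """Fit y on (1, x) by least squares; return coefficients and residuals."""
    X = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef, y - X @ coef


def binned_residual_means(x, u, n_bins=10):
    """Average u within quantile bins of x, a crude estimate of E[u|x]."""
    edges = np.quantile(x, np.linspace(0, 1, n_bins + 1))
    idx = np.clip(np.searchsorted(edges, x, side="right") - 1, 0, n_bins - 1)
    return np.array([u[idx == b].mean() for b in range(n_bins)])


# Hypothetical case 1: the CEF is linear, E[y|x] = 1 + 2x.
y_lin = 1.0 + 2.0 * x + e
# Hypothetical case 2: the CEF is non-linear, E[y|x] = x^2.
y_non = x**2 + e

coef_lin, u_lin = projection_residual(x, y_lin)
coef_non, u_non = projection_residual(x, y_non)

for name, u in [("linear CEF", u_lin), ("quadratic CEF", u_non)]:
    worst_bin = np.abs(binned_residual_means(x, u)).max()
    print(f"{name}: mean(u)={u.mean():.4f}, "
          f"cov(x,u)={np.cov(x, u)[0, 1]:.4f}, "
          f"max |bin mean of u|={worst_bin:.4f}")
```

In the quadratic case the fitted line is still the best linear predictor (roughly intercept 1 and slope 0 for standard normal $x$), yet the binned residual means are far from zero, which is the sense in which $E[u]=0$ and $Cov[x,u]=0$ do not imply $E[u|x]=0$.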
