Usually instrumental variables are introduced as a means to solve the problem $E(u|X)\neq 0$ in the model $Y = X'\beta + u$. This may happen if we omit important variables from the covariate vector $X$, for instance. However, we can always write $Y = E(Y|X) + v$ with $E(v|X)=0$. Moreover, $E(Y|X)$ is always of the form $g\circ X$ for some measurable function $g$. That is, the model $Y = g\circ X + v$ with $E(v|X)=0$ is always correctly specified (even though we may not know what $g$ looks like). Omitted variables play no role here, as this is a purely probabilistic fact. Thus the problem is not really that we have omitted 'important' variables or that $E(Y - X'\beta\, |\, X)\neq 0$, but rather that we are too narrow-minded in demanding that $g$ be a linear function! Now, estimating the function $g$ is precisely the aim of nonparametric regression methods. My question is as follows: is there any reason (aside from tradition) to stick with the instrumental variables framework (which is in my view very clumsy) in lieu of adopting nonparametric regression models to describe relationships between variables? (This is an honest question in spite of its seemingly provocative tone.)

user127022

1 Answer

IV methods remain useful because nonlinearity is far from the only important way in which misspecification may arise, and hence estimating $g$ is not always what we aim for. In fact, you mention it yourself when writing that there may be omitted variables and that $u\neq v$.

Let me try a time-honored economic example, "returns to schooling":

There is probably a positive univariate relationship between earnings and the years of schooling you received (it is also probably nonlinear, as adding a PhD to an MSc probably does not have the same marginal benefit as adding a BSc to a high school degree, but that is not important for the argument). You could then estimate this conditional expectation $g$ by nonparametric methods (see, e.g., here), which would tell you how much more an average, say, MSc graduate makes than an average BSc graduate.

This is, however, very likely not the causal effect of the master's degree (i.e., the difference in earnings is not solely due to the MSc degree), as different types of people choose whether or not to pursue an additional degree. One may expect MSc graduates to be (on average!) more intelligent (which, for example, makes the prospect of sitting more exams less off-putting), to have more perseverance, etc., than BSc graduates.

These are also qualities that will be useful in a later career and hence boost earnings. So these MSc graduates will later earn more not only because they learned useful things during their MSc, but also simply because they are who they are.

Hence, $E(u|\text{yrs of schooling})\neq0$, where $u$ represents things like ability: since ability is not part of the set of regressors, it is part of the error term and, as discussed above, we expect more highly schooled employees to be more intelligent on average.
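
To make the distinction between $u$ and $v$ concrete, here is a small worked sketch. The linear structural form and the symbols $S$ (years of schooling), $A$ (unobserved ability), $\beta$, $\gamma$ and $\varepsilon$ are illustrative assumptions, not part of the question. Suppose $Y = \beta S + \gamma A + \varepsilon$ with $E(\varepsilon\,|\,S,A)=0$, so that the structural error is $u = \gamma A + \varepsilon$. Then

$$
\begin{aligned}
g(s) &= E(Y\,|\,S=s) = \beta s + \gamma\, E(A\,|\,S=s),\\
E(u\,|\,S=s) &= \gamma\, E(A\,|\,S=s) \neq 0 \quad\text{in general},\\
v &= Y - g(S) = \gamma\,\bigl(A - E(A\,|\,S)\bigr) + \varepsilon, \qquad E(v\,|\,S) = 0 .
\end{aligned}
$$

So $g$ is always well defined and $E(v|S)=0$ holds by construction, but the shape of $g$ mixes the causal part $\beta s$ with the selection part $\gamma\,E(A\,|\,S=s)$. Estimating $g$ perfectly does not separate the two.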

Instruments may help us get out of this problem.
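
The argument can be illustrated with a small simulation. This is only a sketch under assumed values: the data-generating process, the coefficients (`beta=0.10`, `gamma=0.50`), the instrument `z` and all function names are hypothetical choices made for illustration. Earnings are generated as `beta * schooling + gamma * ability + noise`, ability is unobserved, and `z` shifts schooling but is independent of ability. Binned means serve as a crude stand-in for a nonparametric estimate of $g$, and the IV estimate is the simple ratio $\widehat{\mathrm{Cov}}(Z,Y)/\widehat{\mathrm{Cov}}(Z,S)$.

```python
import numpy as np


def simulate(n=200_000, beta=0.10, gamma=0.50, seed=0):
    """Hypothetical data: ability is an unobserved confounder, z an instrument."""
    rng = np.random.default_rng(seed)
    ability = rng.normal(size=n)   # unobserved, affects schooling AND earnings
    z = rng.normal(size=n)         # assumed instrument: independent of ability
    schooling = 12.0 + z + ability + rng.normal(size=n)
    earnings = beta * schooling + gamma * ability + rng.normal(size=n)
    return schooling, earnings, z


def ols_slope(x, y):
    """Slope of the linear regression of y on x."""
    return np.cov(x, y)[0, 1] / np.var(x, ddof=1)


def binned_slope(x, y, bins=20):
    """Crude nonparametric estimate of g: mean of y within quantile bins of x,
    summarised by the slope of a line through the bin means."""
    order = np.argsort(x)
    x_means = np.array([chunk.mean() for chunk in np.array_split(x[order], bins)])
    y_means = np.array([chunk.mean() for chunk in np.array_split(y[order], bins)])
    return np.polyfit(x_means, y_means, 1)[0]


def iv_slope(x, y, z):
    """Simple just-identified IV estimate: Cov(z, y) / Cov(z, x)."""
    return np.cov(z, y)[0, 1] / np.cov(z, x)[0, 1]


if __name__ == "__main__":
    s, y, z = simulate()
    print("OLS slope           :", ols_slope(s, y))
    print("binned-means slope  :", binned_slope(s, y))
    print("IV slope            :", iv_slope(s, y, z))
```

In this design the population slope of $g$ is $\beta + \gamma\,\mathrm{Cov}(A,S)/\mathrm{Var}(S) = 0.10 + 0.5/3 \approx 0.27$, so both the OLS and the binned estimates should land near that value, while the IV estimate should land near the assumed causal coefficient $0.10$. The nonparametric estimate is not wrong; it consistently estimates $g$, which is simply a different object from the causal effect.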

Christoph Hanck
  • Would you explain nonparametric regression in short? –  May 10 '16 at 09:18
  • That would not be directly relevant for this answer, in my opinion. But I added a link, thanks. – Christoph Hanck May 10 '16 at 09:23
  • Christoph, I understand the reasoning of omitted variables. The thing is, when we write $Y = g\circ X + v$, where $g$ is the regression function of $Y$ on $X$, _it is necessarily true_ that $E(v|X)=0$ since $v = Y - g\circ X$. Thus $g$ represents the mean _net_ effect of $X$ on $Y$. In your example, there is some function $g$ that relates $X =$ "years of schooling" with $Y =$ "earnings" through $Y = g\circ X + v$ with $E(v|X)=0$. I guess what you are trying to say is that we are not always interested in net effects? Thanks for your response. – user127022 May 10 '16 at 11:48
  • Exactly, that is what I tried to say. The deviation from the conditional expectation is in general not the same as the error $u$ of what is also called the "structural" (as opposed to the "reduced-form") model. – Christoph Hanck May 10 '16 at 11:52
  • In any case (and to keep things simple let's stick to our example), letting $Y = $ 'earnings', $X = $ 'years of schooling', $Z = $ 'other important covariates', one can always write $Y = m(X, Z) + U$ with $E(U\,|\,X,Z) = 0$. Thus even if $X$ and $Z$ are correlated, for each fixed $z$ the function $x\mapsto m(x,z)$ should inform us of causal relationships between $X$ and $Y$. My point is: one can estimate the regression function $m$ via nonparametric methods without worrying too much about misspecification. – user127022 Aug 18 '16 at 14:37
  • @user127022: your reasoning assumes that we observe $z$, while the issue in misspecified models often is that important covariates are not observed. In that case, nonparametric methods will not save us either. – Christoph Hanck Sep 09 '16 at 16:05
  • @ChristophHanck: writing $Z = (Z_1, Z_2)$, where $Z_1=$ 'observable covariates' and $Z_2 =$ 'unobservable covariates', then both models $Y=m(X,Z)+U$ with $E(U|X,Z)=0$, and $Y=g(X,Z_1)+V$ with $E(V|X, Z_1)=0$, are correctly specified ($m$ and $g$ being the regression maps of $Y$ on $X,Z$ and on $X,Z_1$, respectively). The map $x\mapsto m(x,z)$ informs us of net effects of schooling on earnings, controlling for all important covariates, whereas $x\mapsto g(x,z_1)$ is the same but controlling only for **observable** covariates. The problem is we want to control for things like IQ which are in $Z_2$? – user127022 Sep 11 '16 at 16:12
  • Yes, that is what I am trying to say: if you want the *causal* effect (whether it is nonlinear or not) of another year of schooling, you need to include some measure of intelligence/motivation/etc. in your regression. Otherwise, the coefficient on (or nonparametric estimate of the effect of) an additional year of schooling will be confounded, because more intelligent/motivated students choose to receive more schooling *and* tend to do better in the labor market partially irrespective of their additional schooling. – Christoph Hanck Sep 14 '16 at 04:39