
In Mostly Harmless Econometrics, Equation (4.1.12) states that in an IV setting with a binary instrument $Z$, treatment $D$, and potential outcomes $Y_1, Y_0$ for $Y$, $$E[Y_1-Y_0|D_1 = 1, D_{0} = 0] = \frac{Cov(Y,Z)}{Cov(D,Z)} = \frac{E[Y|Z=1]-E[Y|Z=0]}{E[D|Z=1] - E[D|Z=0]}$$

I follow the derivations for the first equality, but why does the second equality hold?

Thank you.

Johnny
  • Does this answer your question? [Proving the LATE Theorem of Angrist and Imbens 1994](https://stats.stackexchange.com/questions/116916/proving-the-late-theorem-of-angrist-and-imbens-1994) – dimitriy Jul 11 '20 at 20:43
  • They don't seem to answer the exact question I had, but I will continue to read them carefully, since they seem very helpful for understanding these issues generally. Thank you! – Johnny Jul 11 '20 at 20:50
  • The second step follows from the definition of univariate regression. In a regression of A on B, the slope gives the change in the expected value of A when B increases by 1, and it equals Cov(A,B)/Var(B). Here you have a ratio of two such slope coefficients (Y on Z and D on Z), each with Var(Z) in the denominator, so the Var(Z) terms cancel. – dimitriy Jul 11 '20 at 20:56
  • Just to add to @DimitriyV.Masterov's comment: in case you're worried that you may not want to assume a linear relationship, note that with a binary regressor (here the instrument $Z$), a linear regression is a fully saturated model, so you're not actually assuming any linearity. – doubled Jul 11 '20 at 21:36

1 Answer

This follows directly from $Z$ being binary together with the law of iterated expectations. Consider an arbitrary random variable $W$. Then $$Cov(W,Z) = E[W(Z-E[Z])] = E[W(1-E[Z])|Z=1]E[Z] - E[WE[Z]|Z=0](1-E[Z])$$

where we use the fact that for binary $Z$, $E[Z] = P(Z=1),$ and so $P(Z=0) = 1-P(Z=1) = 1-E[Z]$. Then it suffices to use linearity of conditional expectations to observe that

$$ E[W(1-E[Z])|Z=1]E[Z] = E[W|Z=1]E[Z](1-E[Z])$$ and $$ E[WE[Z]|Z=0](1-E[Z]) = E[W|Z=0]E[Z](1-E[Z])$$

so the last expression in the first equation simplifies to $$E[W(1-E[Z])|Z=1]E[Z] - E[WE[Z]|Z=0](1-E[Z]) = \bigg(E[W|Z=1]-E[W|Z=0]\bigg)E[Z](1-E[Z])$$

Since this holds for any $W$, take $W=Y$ and $W=D$ and form the ratio. Provided $0 < E[Z] < 1$ and $Cov(D,Z) \neq 0$, the common factor $E[Z](1-E[Z])$ cancels:

$$\frac{Cov(Y,Z)}{Cov(D,Z)} = \frac{\bigg(E[Y|Z=1]-E[Y|Z=0]\bigg)E[Z](1-E[Z])}{\bigg(E[D|Z=1]-E[D|Z=0]\bigg)E[Z](1-E[Z])} = \frac{E[Y|Z=1]-E[Y|Z=0]}{E[D|Z=1]-E[D|Z=0]}$$

as required.
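As a quick numerical sanity check, here is a minimal sketch in Python, assuming NumPy is available. The simulated data-generating process (a 60% complier share, a treatment effect of 2) and the helper names `cov_binary_identity`, `wald_ratio`, and `cov_ratio` are made up for illustration and are not from the book. Because the identity is purely algebraic, it holds exactly in any sample (up to floating-point error) as long as covariances are computed with the $1/n$ convention and both values of $Z$ appear.

```python
import numpy as np


def cov_binary_identity(w, z):
    """Return (Cov(W, Z), (E[W|Z=1] - E[W|Z=0]) * E[Z] * (1 - E[Z])) in sample."""
    w = np.asarray(w, dtype=float)
    z = np.asarray(z, dtype=float)
    p = z.mean()                                # E[Z] = P(Z = 1) for binary Z
    cov = np.mean(w * (z - p))                  # E[W (Z - E[Z])], 1/n convention
    diff = w[z == 1].mean() - w[z == 0].mean()  # E[W|Z=1] - E[W|Z=0]
    return cov, diff * p * (1.0 - p)


def wald_ratio(y, d, z):
    """Difference in mean outcomes over difference in mean treatment, by Z."""
    num = y[z == 1].mean() - y[z == 0].mean()
    den = d[z == 1].mean() - d[z == 0].mean()
    return num / den


def cov_ratio(y, d, z):
    """Cov(Y, Z) / Cov(D, Z), both computed with the 1/n convention."""
    p = z.mean()
    return np.mean(y * (z - p)) / np.mean(d * (z - p))


# Hypothetical simulated data (illustrative assumption, not from the source).
rng = np.random.default_rng(0)
n = 10_000
z = rng.integers(0, 2, size=n)                 # binary instrument
complier = rng.random(n) < 0.6                 # compliers: D = Z
always_taker = (~complier) & (rng.random(n) < 0.5)
d = np.where(complier, z, always_taker.astype(int))
y = 1.0 + 2.0 * d + rng.normal(size=n)         # outcome with noise

cov_yz, rhs_yz = cov_binary_identity(y, z)
print("Cov(Y,Z) vs. scaled mean difference:", cov_yz, rhs_yz)
print("Cov ratio vs. Wald ratio:", cov_ratio(y, d, z), wald_ratio(y, d, z))
```

Each printed pair should agree to machine precision. Note that `np.cov` divides by $n-1$ by default, which would rescale the numerator and denominator equally and so leave the ratio unchanged.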

doubled