I am giving a talk to a group that contains both quantitative and qualitative researchers, but most of the quants' understanding stops at crosstabs and regression.

Does anyone have ideas for intuitively explaining the mechanics of 2SLS and instrumental variables?

kjetil b halvorsen

1 Answer

The label 2SLS itself provides the most intuitive explanation I am aware of: two-stage least squares. The name reflects the fact that $\widehat{\delta}_{\text{2SLS}}$ can be computed in two steps:

  1. Regress every regressor $z_i$ on all the instruments $x_i$ and compute the fitted values $\widehat{Z}=P_{X}Z$, where $P_X=X(X'X)^{-1}X'$ is the projection matrix onto the column space of the instruments.
  2. Regress $y_{i}$ on $\widehat{z}_{i}$. The resulting estimator is $\widehat{\delta}_{\text{2SLS}}$.

Step 1 can be interpreted as extracting the part of the variation in the regressors that is uncorrelated with the errors of the regression model, $\epsilon_i$: by assumption, the instruments are uncorrelated with the error, whereas the regressors themselves may be correlated with it because of endogeneity. In step 2, we then use only that exogenous part of the variation in the regressors to estimate the parameter of interest, $\delta$.
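For an audience that stops at regression, a small simulation may make the two steps concrete. The following is only a sketch under assumptions that are not part of the answer above: a single regressor `z`, a single instrument `x`, an unobserved confounder `u` that creates the endogeneity, a true coefficient `delta = 2.0`, and hypothetical helper names `ols` and `two_stage_least_squares`.

```python
import random


def ols(x, y):
    """Return (intercept, slope) from a least-squares regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    slope = s_xy / s_xx
    return mean_y - slope * mean_x, slope


def two_stage_least_squares(x, z, y):
    """2SLS with one regressor z and one instrument x (hypothetical helper)."""
    # Stage 1: regress the regressor on the instrument and keep the fitted values.
    a, b = ols(x, z)
    z_hat = [a + b * xi for xi in x]
    # Stage 2: regress the outcome on the fitted values from stage 1.
    return ols(z_hat, y)


random.seed(0)
n = 20000
delta = 2.0  # assumed true effect of z on y

x = [random.gauss(0, 1) for _ in range(n)]  # instrument: independent of the error
u = [random.gauss(0, 1) for _ in range(n)]  # unobserved confounder
z = [xi + ui + random.gauss(0, 1) for xi, ui in zip(x, u)]  # endogenous regressor
eps = [ui + random.gauss(0, 1) for ui in u]  # error term, correlated with z via u
y = [delta * zi + ei for zi, ei in zip(z, eps)]

_, delta_ols = ols(z, y)  # uses all variation in z, including the part driven by u
_, delta_2sls = two_stage_least_squares(x, z, y)  # uses only variation driven by x

print("OLS: ", delta_ols)
print("2SLS:", delta_2sls)
```

In this design the OLS slope converges to $\delta + \operatorname{Cov}(z,\epsilon)/\operatorname{Var}(z) = 2 + 1/3$, so it overstates the effect, while the 2SLS slope should be close to 2 because stage 1 keeps only the variation in `z` that comes from `x`.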

This procedure indeed yields 2SLS: the fitted values from step 1 are $\widehat{Z}=P_{X}Z$, and because the projection matrix $P_X$ is symmetric ($P_X'=P_X$) and idempotent ($P_XP_X=P_X$), \begin{eqnarray*} (\widehat{Z}'\widehat{Z})^{-1}\widehat{Z}'y&=&(Z'P_{X}'P_{X}Z)^{-1}Z'P_{X}'y\\ &=&(Z'P_{X}Z)^{-1}Z'P_{X}y\\ &=&\widehat{\delta}_{\text{2SLS}} \end{eqnarray*}
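The identity can also be checked numerically. The sketch below assumes NumPy and an arbitrary simulated design (one endogenous regressor plus a constant in $Z$, two instruments plus a constant in $X$); the data-generating numbers are illustrative only. It builds $P_X$ as $X(X'X)^{-1}X'$ and compares the two-step estimate with the closed-form expression.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500

# Instruments X: a constant and two exogenous variables (illustrative design).
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
u = rng.normal(size=n)  # unobserved component that makes z1 endogenous
z1 = X @ np.array([0.5, 1.0, -1.0]) + u + rng.normal(size=n)

# Regressors Z: a constant and the endogenous regressor.
Z = np.column_stack([np.ones(n), z1])
y = Z @ np.array([1.0, 2.0]) + u + rng.normal(size=n)

# Step 1: regress every column of Z on X and form the fitted values Z_hat.
first_stage_coef = np.linalg.lstsq(X, Z, rcond=None)[0]
Z_hat = X @ first_stage_coef

# Step 2: regress y on the fitted values.
delta_two_stage = np.linalg.lstsq(Z_hat, y, rcond=None)[0]

# Closed form: (Z' P_X Z)^{-1} Z' P_X y with P_X = X (X'X)^{-1} X'.
P_X = X @ np.linalg.solve(X.T @ X, X.T)
delta_closed_form = np.linalg.solve(Z.T @ P_X @ Z, Z.T @ P_X @ y)

print(delta_two_stage)
print(delta_closed_form)
```

The two printed vectors should agree up to floating-point error. Forming the full $n \times n$ matrix `P_X` is done here only to mirror the algebra; for large samples the two-step route avoids it.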

Christoph Hanck