Let's say I have one endogenous variable $X_1$ in the linear model

$$ Y=X_1\beta+\varepsilon $$

and two instrumental variables $Z_1$ and $Z_2$ (strongly correlated with $X_1$ but uncorrelated with the error term $\varepsilon$).

I compute the two-stage least squares in the following way:

$$ \widehat{\beta}_{2SLS} = [X'Z(Z'Z)^{-1}Z'X]^{-1}[X'Z(Z'Z)^{-1}Z'Y] $$
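As a rough illustration of the formula above, here is a minimal NumPy sketch. The function name `two_stage_least_squares`, the simulated data, and the coefficient values are all assumptions made for the example, not part of the original setup.

```python
import numpy as np

def two_stage_least_squares(X, Z, y):
    """Sketch of beta = [X'Z (Z'Z)^{-1} Z'X]^{-1} [X'Z (Z'Z)^{-1} Z'y]."""
    X = np.asarray(X, dtype=float)
    Z = np.asarray(Z, dtype=float)
    y = np.asarray(y, dtype=float)
    if X.ndim == 1:
        X = X[:, None]  # treat a single regressor as an (n, 1) matrix
    # First-stage fitted values: Z (Z'Z)^{-1} Z'X
    X_hat = Z @ np.linalg.solve(Z.T @ Z, Z.T @ X)
    # X'P_Z X equals X_hat'X, and X'P_Z y equals X_hat'y
    return np.linalg.solve(X_hat.T @ X, X_hat.T @ y)

# Hypothetical simulated data: one endogenous regressor, two instruments.
rng = np.random.default_rng(0)
n = 5000
Z = rng.normal(size=(n, 2))
u = rng.normal(size=n)              # structural error
v = 0.8 * u + rng.normal(size=n)    # first-stage error, correlated with u
x1 = Z @ np.array([1.0, -0.5]) + v  # endogenous because v depends on u
y = 2.0 * x1 + u                    # assumed true beta = 2

beta_hat = two_stage_least_squares(x1, Z, y)
beta_ols = (x1 @ y) / (x1 @ x1)     # OLS without intercept, for comparison
print(beta_hat, beta_ols)
```

In this simulated design, OLS is inconsistent because $X_1$ is correlated with the error, while the 2SLS estimate should land close to the assumed value of 2.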

I'm trying to understand the number of degrees of freedom in this situation in order to correct the calculation of the sample variance of my final regression model. I have two options:

  1. According to the question "Multiple linear regression degrees of freedom", I would have $N-2$ degrees of freedom.

  2. However, because during the first stage of 2SLS I regress $X_1$ on the $Z$s, i.e., I run OLS on the linear model

$$ X_1=\delta_1Z_1 + \delta_2Z_2 $$

and in this case we have two predictor variables ($Z_1$ and $Z_2$), so perhaps I have $N-3$ degrees of freedom.

Any hints about which one works here?

JC1

1 Answer

Either is fine, asymptotically. Recall that the degrees-of-freedom correction of the error variance estimate in OLS serves two purposes: it makes $s^2$ an unbiased estimator of $\sigma^2$, and it is a required ingredient of the finite-sample $t$ and $F$ distribution theory in normal regression models.

Now, finite-sample properties of IV estimators are, outside toy models, either unwieldy or simply unavailable, so asymptotic approximations are needed.

In particular, it can be shown that

$$\frac{1}{n}\sum_i(y_i-x_i\hat\beta_{2SLS})^2$$

is consistent for $\sigma^2$. Rescaling this expression by $\frac{n}{n-K}$ for any fixed $K$, such as 2 or 3, does not matter asymptotically, because $\frac{n}{n-K}\to1$ as $n\to\infty$.
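To see how little the choice of $K$ matters as $n$ grows, here is a small simulation sketch. The data-generating process (true $\sigma^2=1$, the coefficients, the helper name `sigma2_estimates`) is an assumption made for illustration. Note that the residuals are built from the original $x_i$, as in the expression above, not from the first-stage fitted values.

```python
import numpy as np

rng = np.random.default_rng(1)

def sigma2_estimates(n):
    """Return SSR / (n - K) for K in {0, 2, 3} on one simulated sample."""
    Z = rng.normal(size=(n, 2))
    u = rng.normal(size=n)               # structural error, sigma^2 = 1
    v = 0.8 * u + rng.normal(size=n)     # makes x endogenous
    x = Z @ np.array([1.0, -0.5]) + v
    y = 2.0 * x + u
    # 2SLS with one regressor: first-stage fit, then the IV ratio
    x_hat = Z @ np.linalg.solve(Z.T @ Z, Z.T @ x)
    beta = (x_hat @ y) / (x_hat @ x)
    resid = y - x * beta                 # uses original x, not x_hat
    ssr = resid @ resid
    return {K: ssr / (n - K) for K in (0, 2, 3)}

results = {n: sigma2_estimates(n) for n in (50, 500, 50000)}
for n, est in results.items():
    print(n, {K: round(val, 4) for K, val in est.items()})
```

For small $n$ the three divisors give visibly different numbers, while for large $n$ they agree to several decimal places, which is the sense in which either correction is fine.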

Christoph Hanck