
I want to prove that $V$ is an unbiased estimator of the covariance matrix of the OLS estimator, $$(X'X)^{-1}(X'DX)(X'X)^{-1},$$ where $D=\operatorname{diag}(\sigma^2,\ldots,\sigma^2)=E(ee'|X)$ in a linear model. The estimator in question is

$$V = \frac{n}{n-k}(X'X)^{-1}\left(\sum_{i=1}^n{X_iX'_i\hat{e}^2_i}\right)(X'X)^{-1}$$
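For concreteness, here is a minimal R sketch (all object names and data-generating values are my own) that computes $V$ directly from this formula and checks it against `sandwich::vcovHC` with `type = "HC1"`, which implements the same estimator.

library(sandwich)

set.seed(1)
n <- 100; k <- 2
X <- cbind(1, rnorm(n))
y <- drop(X %*% c(1, 2)) + rnorm(n)

fit  <- lm(y ~ X - 1)
ehat <- resid(fit)
XtXi <- solve(crossprod(X))                 # (X'X)^{-1}

meat <- crossprod(X, X * ehat^2)            # sum_i X_i X_i' ehat_i^2
V    <- n / (n - k) * XtXi %*% meat %*% XtXi

all.equal(unname(V), unname(vcovHC(fit, type = "HC1")))  # TRUE (up to numerical tolerance)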

To do so, I first find the conditional expectation of $V$.

$$E[V|X]= \frac{n}{n-k}(X'X)^{-1}\left(\sum_{i=1}^n X_iX'_i\,E(\hat{e}^2_i|X)\right)(X'X)^{-1}$$

However, I am not sure how to proceed from here.


Answer (Christoph Hanck):

I do not think your claim is correct. You are analyzing what MacKinnon and White (1985) refer to as the heteroskedasticity-robust variance estimator $HC_1$. I will argue that, instead, the estimator they call $HC_2$ is unbiased when in fact no heteroskedasticity is present, while $HC_1$ is unbiased only in a special case. Here is more detail:

Notice that the residuals satisfy $\hat e=My$ for $M=I-H$ and hat matrix $$H=\{h_{ij}\}_{i,j=1,\ldots,n}=\{x_i'(X'X)^{-1}x_j\}_{i,j=1,\ldots,n}=X(X'X)^{-1}X'.$$ Hence, $$ \hat e_i=(1-h_{ii})y_i-\sum_{j\neq i}h_{ij}y_j, $$ so that $$\hat e_i^2=\left[(1-h_{ii})y_i-\sum_{j\neq i}h_{ij}y_j\right]^2.$$

In conditional expectation and under random sampling (i.i.d.) with conditional homoskedasticity - assumptions you do not state but that I need - $$E(\hat e_i^2|X)=Var(\hat e_i|X),$$ in view of $E(\hat e|X)=ME(y|X)=MX\beta=0$. Now, again by the i.i.d. assumption,

\begin{align*} Var(\hat e_i|X)&=(1-h_{ii})^2Var(y_i|X)+\sum_{j\neq i}h_{ij}^2Var(y_j|X)\\ &=(1-h_{ii})^2\sigma^2+\sum_{j\neq i}h_{ij}^2\sigma^2, \end{align*}

as covariances are zero and all conditional variances are assumed to be the same. Next, notice that $$ \sum_{j=1}^nh_{ij}^2=h_{ii},$$ as, by symmetry of $H$, \begin{align*} \sum_{j=1}^nh_{ij}^2&=\sum_{j=1}^nh_{ij}h_{ji}\\ &=\sum_{j=1}^nx_i'(X'X)^{-1}x_jx_j'(X'X)^{-1}x_i\\ &=x_i'(X'X)^{-1}\sum_{j=1}^nx_jx_j'(X'X)^{-1}x_i\\ &=x_i'(X'X)^{-1}(X'X)(X'X)^{-1}x_i\\ &=x_i'(X'X)^{-1}x_i\\ &=h_{ii}\\ \end{align*}
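This identity is easy to confirm numerically; here is a tiny sketch (the design matrix is arbitrary and of my own choosing):

# Check sum_j h_ij^2 = h_ii for an arbitrary design matrix
set.seed(1)
n <- 25
X <- cbind(1, rnorm(n), runif(n))
H <- X %*% solve(crossprod(X), t(X))                 # hat matrix X (X'X)^{-1} X'
all.equal(unname(rowSums(H^2)), unname(diag(H)))     # TRUE: H is symmetric and idempotent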

Hence, by multiplying out and rearranging, \begin{align*} E(\hat e_i^2|X)&=(1-2h_{ii}+h_{ii}^2)\sigma^2+\sum_{j\neq i}h_{ij}^2\sigma^2\\ &=(1-h_{ii})\sigma^2-h_{ii}\sigma^2+\sum_{j=1}^nh_{ij}^2\sigma^2\\ &=(1-h_{ii})\sigma^2. \end{align*} So, under homoskedasticity, unbiasedness would call for dividing each squared residual by $1-h_{ii}$, not multiplying them all by $n/(n-k)$.
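As a sanity check of this expression, one can hold $X$ fixed and simulate: the Monte Carlo average of $\hat e_i^2$ should be close to $(1-h_{ii})\sigma^2$. A minimal sketch (all design and parameter values are mine):

# Monte Carlo check that E(ehat_i^2 | X) = (1 - h_ii) * sigma^2 under homoskedasticity
set.seed(1)
n <- 20; sigma <- 1.5
X <- cbind(1, rnorm(n))
H <- X %*% solve(crossprod(X), t(X))
M <- diag(n) - H

e2 <- replicate(5000, {
  y <- drop(X %*% c(1, 2)) + rnorm(n, sd = sigma)    # homoskedastic errors, X held fixed
  drop(M %*% y)^2                                    # squared residuals
})
cbind(mc = rowMeans(e2), theory = (1 - diag(H)) * sigma^2)   # the two columns agree closely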

Only in a "balanced design" (note that $\sum_ih_{ii}=k$, so that $k/n$ is the average value of the $h_{ii}$), in which $h_{ii}=k/n$ for all $i$, would the two coincide, as we would then have $$ \frac{1}{1-h_{ii}}=\frac{1}{1-k/n}=\frac{n}{n-k}. $$ So all in all, for $$ HC_2=(X'X)^{-1}\left(\sum_{i=1}^n{x_ix'_i\frac{\hat{e}^2_i}{1-h_{ii}}}\right)(X'X)^{-1} $$ we obtain \begin{align*} E(HC_2|X)&=(X'X)^{-1}\left(\sum_{i=1}^nx_ix'_i\sigma^2\right)(X'X)^{-1}\\&=\sigma^2(X'X)^{-1}. \end{align*} Hence, $HC_2$'s conditional expected value equals the conditional variance of $\hat\beta$ under homoskedasticity, which in general differs from the unconditional variance.
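For completeness, the $HC_2$ formula above can also be computed by hand and checked against `sandwich::vcovHC` with `type = "HC2"`, which uses the same $\hat e_i^2/(1-h_{ii})$ weighting; the data-generating values below are my own illustration:

# Compute HC2 by hand and compare with sandwich::vcovHC(..., type = "HC2")
library(sandwich)

set.seed(1)
n <- 100
X <- cbind(1, rnorm(n))
y <- drop(X %*% c(1, 2)) + rnorm(n)

fit  <- lm(y ~ X - 1)
ehat <- resid(fit)
h    <- hatvalues(fit)                               # leverages h_ii
XtXi <- solve(crossprod(X))

HC2 <- XtXi %*% crossprod(X, X * (ehat^2 / (1 - h))) %*% XtXi
all.equal(unname(HC2), unname(vcovHC(fit, type = "HC2")))    # TRUE (up to numerical tolerance)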

Here is a little simulation to illustrate. I take the regressor to be fixed in repeated samples here to avoid the distinction between conditional and unconditional variance discussed above.

I find that the bias of $HC_2$ tends to be an order of magnitude smaller, although both biases are very small for the designs considered here.

library(sandwich)

n <- 50

# A few possible designs for the fixed regressor (the last one is used):
# x <- rnorm(n, sd = 3)
# x <- rt(n, df = 2)
x <- runif(n, -10, 10)

sigma <- .2

mc.function <- function(n){
  u <- rnorm(n, sd = sigma)        # homoskedastic errors
  y <- 2*x + u
  limo <- lm(y ~ x - 1)            # regression through the origin
  return(c(vcovHC(limo, type = "HC1"), vcovHC(limo, type = "HC2")))
}

# True conditional variance of the OLS slope estimator for fixed x:
true.cond.var <- sigma^2/sum(x^2)

vcovs <- replicate(1000, mc.function(n))
(bias <- rowMeans(vcovs - true.cond.var))
abs(bias[1]) > abs(bias[2])        # typically TRUE: the HC1 bias exceeds the HC2 bias
  • I believe that when you *sum* the squared residuals, though, you obtain a quantity closely related to the trace of $H,$ which depends only on $n$ and $p,$ *regardless of the design,* thereby leading to the claim in the question. – whuber Nov 16 '21 at 17:31
  • Yeah, $tr(H)=tr(X(X'X)^{-1}X')=tr((X'X)^{-1}X'X)=k$ only depends on $k$, which gives rise to the unbiasedness of the standard estimator $\sum_i\hat e_i^2/(n-k)$ under homoskedasticity, e.g. https://stats.stackexchange.com/questions/76738/proof-that-regression-residual-error-is-an-unbiased-estimate-of-error-variance. I am not sure how that applies here, though, since the heteroskedasticity-robust estimator operates on the product $x_i\hat e_i$, so that we need to work on individual squared residuals, not their sum. See also https://www.econstor.eu/bitstream/10419/189084/1/qed_wp_0537.pdf – Christoph Hanck Nov 16 '21 at 18:30
  • 1
    The part I did not catch--and now I see you asked for clarification--is whether the model is heteroscedastic or not. If, at the outset, you were to stipulate that you are discussing the heteroscedastic model, there would be less chance of misunderstanding. (+1) – whuber Nov 16 '21 at 19:39