
I stumbled over a rather simple result of OLS regression which is

$$ Var[\hat\beta] = \sigma^2(X^TX)^{-1} $$ where $\sigma^2$ is the variance of the error term $u$ and $X$ is the regressor matrix.

I first just accepted the proof in my textbook, but now I think it either uses sloppy notation or I am missing something. Here $\hat\beta$ is the estimated parameter and $\beta$ the true parameter (assuming unbiasedness).

It states that, using $\hat\beta - \beta = (X^TX)^{-1}X^Tu$,

\begin{align} Var[\hat\beta] &= E[(\hat\beta - \beta)(\hat\beta-\beta)^T] \\ &= E[(X^TX)^{-1}X^Tuu^TX(X^TX)^{-1}] \\ &= (X^TX)^{-1}X^T E[uu^T] X(X^TX)^{-1}, \end{align}

but $X$ was only assumed to be exogenous, not non-stochastic. Under this weaker assumption, I think $X$ cannot be pulled outside the expectation operator.

At the moment, I think it should be $Var[\hat\beta|X]$ for the derivation to make sense. Is that the case? My web research couldn't clarify this; I only found derivations similar to the one above, without further explanation.
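To illustrate my concern, here is a minimal simulation sketch (my own addition, not from the textbook; the sample size, the standard-normal design, and $\sigma^2 = 1$ are arbitrary choices): every fresh draw of a stochastic $X$ produces a different value of $\sigma^2(X^TX)^{-1}$, which only makes sense if the formula is read as a variance conditional on $X$.

```python
# Minimal sketch: with stochastic regressors, sigma^2 (X^T X)^{-1}
# changes with every draw of X, so it can only be Var[beta_hat | X].
import numpy as np

rng = np.random.default_rng(0)
n, sigma2 = 50, 1.0  # illustrative choices

def conditional_var(X, sigma2):
    """The textbook formula sigma^2 (X^T X)^{-1} for a *given* X."""
    return sigma2 * np.linalg.inv(X.T @ X)

# Two independent draws of the regressor matrix give two different
# "variances" -- evidence that the formula is conditional on X.
for _ in range(2):
    X = np.column_stack([np.ones(n), rng.normal(size=n)])
    print(np.diag(conditional_var(X, sigma2)))
```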

Alex
  • This looks like frequentist statistics and thus data or observations should be random, not the inputs or regressors, shouldn't they? The data are random because there is random error added. – gwr Nov 28 '15 at 11:40
  • A couple of pages before this derivation the book authors state that assuming $X$ is nonstochastic is "frequently not a reasonable assumption". Therefore, this assumption was relaxed to exogeneity. – Alex Nov 28 '15 at 11:46
  • Section 6 of [this tutorial](http://web.stanford.edu/~mrosenfe/soc_meth_proj3/matrix_OLS_NYU_notes.pdf) shows that the simplification only holds if $X$ is assumed non-stochastic. – gwr Nov 28 '15 at 12:03
  • This would support my assumption that if we assume only exogeneity the correct notation would have been $Var[\hat\beta|X]$ and not $Var[\hat\beta]$. But why is this fact skipped so often? Seems not correct to me. – Alex Nov 28 '15 at 13:42
  • 1
    It seems that under the assumptions usually made the usual estimate for the variance given that $X$ is fixed is also an unbiased estimate of the unconditional estimate when $X$ is random as is shown [here](http://thestatsgeek.com/2013/08/30/why-regression-inference-assuming-fixed-predictors-is-still-valid-for-random-predictors/). Maybe that helps. – gwr Nov 28 '15 at 15:14

1 Answer


You are right that the conditional variance is not, in general, the same as the unconditional one. The variance decomposition lemma (law of total variance) says that, for random variables $X$ and $Y$,

$$ Var(X)=E[Var(X|Y)]+Var[E(X|Y)]. $$

Translated to our problem,

$$ Var(\widehat{\beta})=E[Var(\widehat{\beta}|X)]+Var[E(\widehat{\beta}|X)]. $$

Now, using that OLS is conditionally unbiased (under suitable assumptions like the exogeneity assumed here), we have

$$ E(\widehat{\beta}|X)=\beta $$

and thus, as $\beta$ is a constant,

$$ Var[E(\widehat{\beta}|X)]=0, $$

so that

$$ Var(\widehat{\beta})=E[Var(\widehat{\beta}|X)]=E[\sigma^2(X^TX)^{-1}]=\sigma^2E[(X^TX)^{-1}]. $$
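As a sanity check, here is a small Monte Carlo sketch of this identity (my own illustration, not part of the original answer; the sample size, the standard-normal design, and $\sigma^2 = 1$ are arbitrary assumptions). The empirical covariance of $\widehat{\beta}$ across joint draws of $(X, u)$ should approximately match the average of $\sigma^2(X^TX)^{-1}$ over the draws of $X$:

```python
# Monte Carlo sketch of Var(beta_hat) = sigma^2 * E[(X^T X)^{-1}]
# for stochastic X. All parameter choices are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(1)
n, reps, sigma2 = 50, 20_000, 1.0
beta = np.array([1.0, 2.0])  # arbitrary true coefficients

betahats = np.empty((reps, 2))
cond_vars = np.empty((reps, 2, 2))
for r in range(reps):
    X = np.column_stack([np.ones(n), rng.normal(size=n)])  # stochastic X
    u = rng.normal(scale=np.sqrt(sigma2), size=n)          # exogenous error
    y = X @ beta + u
    betahats[r] = np.linalg.solve(X.T @ X, X.T @ y)        # OLS estimate
    cond_vars[r] = sigma2 * np.linalg.inv(X.T @ X)         # Var[beta_hat | X]

print(np.cov(betahats.T))      # empirical unconditional Var(beta_hat)
print(cond_vars.mean(axis=0))  # approximates sigma^2 * E[(X^T X)^{-1}]
```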

Christoph Hanck
  • Oh, just realized that this is pretty much what the link of @gwr says... – Christoph Hanck Nov 28 '15 at 16:37
  • 1
    +1 I am only a *biased* Bayesian and thus have forgotten long ago about what unbiased estimators really are good for and thus had to use secondary material to answer anyways. As a pun: Your concise answer is better for the long run... :-) – gwr Nov 28 '15 at 16:50