
I just read this very insightful post about ridge regression, where the author stated that the variance of $\hat\beta$ is:

$$\text{var}(\hat\beta) = \sigma^2(\textbf{X}^\prime \textbf{X})^{-1}.$$

I couldn't figure out why this holds. Can anyone elaborate a bit?

amoeba
user152503

2 Answers


The covariance result you are looking at occurs under a standard regression model using ordinary least-squares (OLS) estimation. The OLS estimator (written as a random variable) is given by:

$$\begin{equation} \begin{aligned} \hat{\boldsymbol{\beta}} &= (\boldsymbol{x}^{\text{T}} \boldsymbol{x})^{-1} (\boldsymbol{x}^{\text{T}} \boldsymbol{Y}) \\[6pt] &= (\boldsymbol{x}^{\text{T}} \boldsymbol{x})^{-1} \boldsymbol{x}^{\text{T}} (\boldsymbol{x} \boldsymbol{\beta} + \boldsymbol{\varepsilon}) \\[6pt] &= \boldsymbol{\beta} + (\boldsymbol{x}^{\text{T}} \boldsymbol{x})^{-1} \boldsymbol{x}^{\text{T}} \boldsymbol{\varepsilon}. \end{aligned} \end{equation}$$

In the standard linear regression model we have $\mathbb{E}(\boldsymbol{\varepsilon}) = \boldsymbol{0}$ and $\mathbb{V}(\boldsymbol{\varepsilon}) = \sigma^2 \boldsymbol{I}$ so that the estimator is unbiased with covariance matrix given by:

$$\begin{equation} \begin{aligned} \mathbb{V}(\hat{\boldsymbol{\beta}}) &= \mathbb{V}((\boldsymbol{x}^{\text{T}} \boldsymbol{x})^{-1} \boldsymbol{x}^{\text{T}} \boldsymbol{\varepsilon}) \\[6pt] &= ((\boldsymbol{x}^{\text{T}} \boldsymbol{x})^{-1} \boldsymbol{x}^{\text{T}} ) \mathbb{V}(\boldsymbol{\varepsilon}) ((\boldsymbol{x}^{\text{T}} \boldsymbol{x})^{-1} \boldsymbol{x}^{\text{T}} )^{\text{T}} \\[6pt] &= \sigma^2 ((\boldsymbol{x}^{\text{T}} \boldsymbol{x})^{-1} \boldsymbol{x}^{\text{T}} ) \boldsymbol{I} ((\boldsymbol{x}^{\text{T}} \boldsymbol{x})^{-1} \boldsymbol{x}^{\text{T}} )^{\text{T}} \\[6pt] &= \sigma^2 ((\boldsymbol{x}^{\text{T}} \boldsymbol{x})^{-1} \boldsymbol{x}^{\text{T}} ) ((\boldsymbol{x}^{\text{T}} \boldsymbol{x})^{-1} \boldsymbol{x}^{\text{T}} )^{\text{T}} \\[6pt] &= \sigma^2 (\boldsymbol{x}^{\text{T}} \boldsymbol{x})^{-1} (\boldsymbol{x}^{\text{T}} \boldsymbol{x}) (\boldsymbol{x}^{\text{T}} \boldsymbol{x})^{-1} \\[6pt] &= \sigma^2 (\boldsymbol{x}^{\text{T}} \boldsymbol{x})^{-1}. \end{aligned} \end{equation}$$

Note that this is the conditional covariance of the estimator given the design matrix $\boldsymbol{x}$.
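This result is easy to check by simulation: holding the design matrix fixed, draw many response vectors, refit OLS each time, and compare the empirical covariance of the estimates with $\sigma^2 (\boldsymbol{x}^{\text{T}} \boldsymbol{x})^{-1}$. A minimal sketch in NumPy (the design matrix, coefficients, and noise scale below are arbitrary choices for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# Fixed design matrix (we condition on x throughout), with an intercept column.
n, p = 50, 3
x = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
beta = np.array([1.0, 2.0, -0.5])   # true coefficients (arbitrary)
sigma = 1.5                          # noise standard deviation (arbitrary)

# Theoretical covariance: sigma^2 (x'x)^{-1}
theory = sigma**2 * np.linalg.inv(x.T @ x)

# Monte Carlo: simulate many response vectors and refit OLS on each.
n_sims = 100_000
xtx_inv_xt = np.linalg.inv(x.T @ x) @ x.T            # the fixed matrix (x'x)^{-1} x'
Y = x @ beta + sigma * rng.normal(size=(n_sims, n))  # each row is one simulated Y
estimates = Y @ xtx_inv_xt.T                          # each row is one beta-hat

empirical = np.cov(estimates, rowvar=False)
print(np.max(np.abs(empirical - theory)))  # shrinks toward 0 as n_sims grows
```

The empirical covariance matrix of the refitted estimates matches the theoretical one up to Monte Carlo error.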

Ben

Four things to note:

$\hat{\beta} =(\textbf{X}^\prime\textbf{X})^{-1}\textbf{X}^\prime\textbf{Y}$

$\text{var}({\textbf{A}\textbf{Y}})=\textbf{A}\text{var}(\textbf{Y})\textbf{A}^\prime$

$\text{var}(\textbf{Y}|\textbf{X})=\sigma^2\textbf{I}$ (actually, everything is conditioned on $\textbf{X}$)

$(\textbf{X}^\prime\textbf{X})^{-1}$ is symmetric.

Putting these together:

$$\text{var}(\hat\beta|\textbf{X}) = (\textbf{X}^\prime\textbf{X})^{-1}\textbf{X}^\prime \,\sigma^2\textbf{I}\, \textbf{X}(\textbf{X}^\prime\textbf{X})^{-1} = \sigma^2(\textbf{X}^\prime\textbf{X})^{-1}.$$
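These four facts can be checked numerically in a few lines. A sketch with an arbitrary design matrix `X` and noise variance (both made up for illustration), writing $\hat\beta = \textbf{A}\textbf{Y}$ with $\textbf{A} = (\textbf{X}^\prime\textbf{X})^{-1}\textbf{X}^\prime$:

```python
import numpy as np

rng = np.random.default_rng(1)

# Arbitrary design matrix X and noise variance, for illustration only.
X = rng.normal(size=(20, 4))
sigma2 = 2.0

A = np.linalg.inv(X.T @ X) @ X.T             # beta-hat = A Y
sandwich = A @ (sigma2 * np.eye(20)) @ A.T   # var(A Y) = A var(Y) A'
target = sigma2 * np.linalg.inv(X.T @ X)     # the claimed simplification

print(np.allclose(sandwich, target))   # the sandwich collapses to sigma^2 (X'X)^{-1}
print(np.allclose(target, target.T))   # (X'X)^{-1} is indeed symmetric
```

Both checks print `True`: applying the $\textbf{A}\,\text{var}(\textbf{Y})\,\textbf{A}^\prime$ rule with $\text{var}(\textbf{Y}|\textbf{X}) = \sigma^2\textbf{I}$ collapses to $\sigma^2(\textbf{X}^\prime\textbf{X})^{-1}$.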

sjw