I'm seeking a little help understanding the derivation of the standard errors of the coefficients in linear regression. Most of the derivation is already documented in ocram's answer to "How are the standard errors of coefficients calculated in a regression?", but there is one step I can't follow.
As background, we have the model: $$ \left| \begin{array}{l} \mathbf{y} = \mathbf{X} \mathbf{\beta} + \mathbf{\epsilon} \\ \mathbf{\epsilon} \sim N(0, \sigma^2 \mathbf{I}), \end{array} \right.$$
And we calculate (a):
$$ \widehat{\textrm{Var}}(\hat{\mathbf{\beta}}) = \hat{\sigma}^2 (\mathbf{X}^{\prime} \mathbf{X})^{-1}, $$
using (b):
$$\hat{\mathbf{\beta}} = (\mathbf{X}^{\prime} \mathbf{X})^{-1} \mathbf{X}^{\prime} \mathbf{y},$$
and (c):
$$ {\textrm{Var}}({\mathbf{Ay}}) = \mathbf{A}{\textrm{Var}}({\mathbf{y}}) \mathbf{A}^{\prime}, $$
where (d):
$${\mathbf{A}} = (\mathbf{X}^{\prime} \mathbf{X})^{-1} \mathbf{X}^{\prime}.$$
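To make the setup concrete, here's a minimal numeric sketch of formulas (b) and (a) on made-up data (the design matrix, true coefficients, and noise level are all invented for illustration, and numpy is assumed):

```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up design matrix with an intercept column: n = 50 observations, p = 2 coefficients.
n = 50
X = np.column_stack([np.ones(n), rng.normal(size=n)])
beta_true = np.array([1.0, 2.0])        # hypothetical true coefficients
sigma = 0.5                              # hypothetical noise standard deviation
y = X @ beta_true + rng.normal(scale=sigma, size=n)

# (b): beta_hat = (X'X)^{-1} X' y
XtX_inv = np.linalg.inv(X.T @ X)
beta_hat = XtX_inv @ X.T @ y

# Unbiased estimate of sigma^2 from the residuals (n - p degrees of freedom).
resid = y - X @ beta_hat
sigma2_hat = resid @ resid / (n - X.shape[1])

# (a): Var-hat(beta_hat) = sigma2_hat * (X'X)^{-1}, a p x p matrix;
# the square roots of its diagonal are the reported standard errors.
var_beta_hat = sigma2_hat * XtX_inv
se = np.sqrt(np.diag(var_beta_hat))
```

Note that $\widehat{\textrm{Var}}(\hat{\mathbf{\beta}})$ comes out as a $p \times p$ matrix, not a scalar, which is what prompts my question below.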
From (c) & (d), I can tell that ${\textrm{Var}}({\mathbf{y}})$ must be an $n \times n$ matrix.
But **why** is the variance of a vector with dimension $n$ defined as an $n \times n$ matrix?
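For what it's worth, identity (c) can be checked numerically on simulated data (sizes and seed below are arbitrary, numpy assumed); the sample covariance obeys the same identity, which also confirms the shapes have to conform as I described:

```python
import numpy as np

rng = np.random.default_rng(1)

# If A is p x n, then A Var(y) A' is p x p, which only conforms
# when Var(y) is an n x n matrix.
n, p = 5, 2
A = rng.normal(size=(p, n))

# Simulate many independent draws of y ~ N(0, sigma^2 I); each row of Y is one y.
sigma2 = 0.3
draws = 2000
Y = rng.normal(scale=np.sqrt(sigma2), size=(draws, n))
AY = Y @ A.T                               # each row is one A y

cov_Ay = np.cov(AY, rowvar=False)          # sample Var(Ay), p x p
cov_y = np.cov(Y, rowvar=False)            # sample Var(y),  n x n
predicted = A @ cov_y @ A.T                # right-hand side of (c)
# cov_Ay and predicted agree exactly, since sample covariance is
# bilinear in the same way as the population covariance.
```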
I'm getting stuck on the following issues:
(1) What is the definition of ${\textrm{Var}}({\mathbf{y}})$?
(2) Why does it need to be defined that way (and why is it $n \times n$)?
(3) How does one calculate it?
(4) What is the general result of that calculation in this case?
I hope that these are easy questions to answer for someone versed in multivariate statistical theory. Thanks in advance for your help!