Under the Assumptions of the Simple Linear Regression Model, Why Is This Term a Chi-Square Random Variable with $n - 2$ Degrees of Freedom?

Question

I am told that, under the assumptions of the simple linear regression model, $\dfrac{n\hat\sigma^2}{\sigma^2} \sim \chi^2_{(n - 2)}$. Why is $\dfrac{n\hat\sigma^2}{\sigma^2}$ a chi-square random variable with $n - 2$ degrees of freedom? It seems to resemble a sum of $n$ squared standard normal variables of the form $\dfrac{\hat\sigma^2}{\sigma^2}$, so my previous knowledge leads me to think that it should have $n$ degrees of freedom. However, I'm not convinced that each term is a squared standard normal variable, since it does not quite match the form $\left( \dfrac{X - \bar{X}}{\sigma} \right)^2$.

My question stems from this image:

From this page.

Since the other terms take the form of a squared standard normal variable, I understand why they are chi-squared random variables with 1 degree of freedom. However, I do not understand why the last term has $n - 2$ degrees of freedom.

I would greatly appreciate it if people could please take the time to clarify this.

score 1 · Accepted Answer · answered Mar 16 '18 at 01:55

One degree of freedom is used to calculate $\bar{x}$, and one degree of freedom is used to calculate $\bar{y}$. These, in turn, are used to calculate $\hat{\beta}$ and $\hat{\alpha}$, as well as their sample standard errors. Hence, $n-2$.
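One concrete way to see the two lost degrees of freedom is that the least-squares residuals $e_i = y_i - \hat{\alpha} - \hat{\beta} x_i$ always satisfy two linear constraints, $\sum e_i = 0$ and $\sum x_i e_i = 0$, so only $n - 2$ of them are free to vary. The following is a minimal sketch that checks this numerically. The sample size, the true coefficients, and the seed are arbitrary assumptions chosen for illustration only.

```python
import random

# Illustrative data: n, the true coefficients, and the seed are arbitrary assumptions.
rng = random.Random(1)
n = 8
x = [rng.uniform(0, 10) for _ in range(n)]
y = [3.0 - 0.5 * xi + rng.gauss(0, 1) for xi in x]

# Ordinary least-squares estimates for simple linear regression.
x_bar = sum(x) / n
y_bar = sum(y) / n
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
beta_hat = s_xy / s_xx
alpha_hat = y_bar - beta_hat * x_bar

# Residuals from the fitted line.
residuals = [yi - alpha_hat - beta_hat * xi for xi, yi in zip(x, y)]

# The two linear constraints: both sums are zero up to rounding error.
sum_resid = sum(residuals)
sum_x_resid = sum(xi * ei for xi, ei in zip(x, residuals))
print(f"sum of residuals       = {sum_resid:.2e}")
print(f"sum of x_i * residuals = {sum_x_resid:.2e}")
```

Both printed sums should be zero apart from floating-point rounding, whatever data are used.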

Alexis


Ahh, I think I understand now. Thank you for the assistance. – The Pointer Mar 16 '18 at 02:46

score 1 · Answer 2 · answered Mar 16 '18 at 03:48

Consider the general linear regression model $Y = x \beta + \varepsilon$ with normal errors $\varepsilon \sim \text{N}(0, \sigma^2 I)$ and $m$ explanatory variables, so that the design matrix $x$ (including the intercept column) is $n \times (m+1)$ with full column rank. You can derive the distribution of the variance estimator by writing it as a scaled quadratic form in the error term. Letting $h \equiv x (x^\text{T} x)^{-1} x^\text{T}$ denote the hat matrix, and noting that $h x = x$ (so that $(I - h) Y = (I - h) \varepsilon$), gives:

$$\begin{equation} \begin{aligned} (n-m-1) \hat{\sigma}^2 = \text{SSE} &= || Y - \hat{Y} ||^2 \\[4pt] &= Y^\text{T} (I - h) Y \\[4pt] &= \varepsilon^\text{T} (I - h) \varepsilon \\[4pt] &\sim \sigma^2 \cdot \text{Chi-Sq}(\text{df} = \text{Rank}(I - h)) \\[4pt] &= \sigma^2 \cdot \text{Chi-Sq}(\text{df} = n-m-1). \\[4pt] \end{aligned} \end{equation}$$
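
The degrees of freedom $n - m - 1$ in the last step can be checked with a short trace computation. This is a sketch that assumes $x$ is the $n \times (m+1)$ design matrix (intercept column plus $m$ explanatory variables) with full column rank. Since $I - h$ is symmetric and idempotent, its rank equals its trace:

$$\begin{equation} \begin{aligned} \text{Rank}(I - h) = \text{tr}(I - h) &= n - \text{tr} \big( x (x^\text{T} x)^{-1} x^\text{T} \big) \\[4pt] &= n - \text{tr} \big( (x^\text{T} x)^{-1} x^\text{T} x \big) \\[4pt] &= n - \text{tr}(I_{m+1}) \\[4pt] &= n - m - 1. \end{aligned} \end{equation}$$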

Simple linear regression is the case $m=1$, which gives:

$$\begin{equation} \begin{aligned} (n-2) \frac {\hat{\sigma}^2}{\sigma^2} \sim \text{Chi-Sq}(\text{df} = n-2). \end{aligned} \end{equation}$$

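Here $\hat{\sigma}^2 = \text{SSE}/(n-2)$ is the unbiased variance estimator. If $\hat{\sigma}^2$ instead denotes the maximum-likelihood estimator $\text{SSE}/n$, as in the question, the same result reads $\dfrac{n \hat{\sigma}^2}{\sigma^2} \sim \chi^2_{(n-2)}$. Understanding this deeply requires learning a bit about quadratic forms for normal random vectors. If you just want a rough intuitive explanation, it suffices to note that we lose one degree of freedom for each coefficient we estimate in the regression (here, the intercept and the slope).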
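
The result can also be checked by simulation. The following is a rough sketch rather than a proof. It repeatedly simulates a simple linear regression with normal errors, computes $\text{SSE}/\sigma^2$, and compares the sample mean and variance with those of a chi-square distribution with $n - 2$ degrees of freedom (mean $n - 2$, variance $2(n - 2)$). The function names, the design points, the parameter values, and the seed are all assumptions made for illustration.

```python
import random
import statistics


def sse_over_sigma2(n, alpha, beta, sigma, rng):
    """Simulate one data set and return SSE / sigma^2 from a least-squares fit."""
    x = [float(i) for i in range(n)]  # fixed design points (arbitrary choice)
    y = [alpha + beta * xi + rng.gauss(0.0, sigma) for xi in x]
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    beta_hat = s_xy / s_xx
    alpha_hat = y_bar - beta_hat * x_bar
    sse = sum((yi - alpha_hat - beta_hat * xi) ** 2 for xi, yi in zip(x, y))
    return sse / sigma ** 2


def simulate(n=10, reps=20000, seed=0):
    """Return the sample mean and variance of SSE / sigma^2 over many replications."""
    rng = random.Random(seed)
    # alpha = 1.0, beta = 2.0, sigma = 1.5 are arbitrary illustrative values.
    draws = [sse_over_sigma2(n, 1.0, 2.0, 1.5, rng) for _ in range(reps)]
    return statistics.fmean(draws), statistics.variance(draws)


mean, var = simulate()
print(f"sample mean     = {mean:.2f}  (theory for n - 2 = 8 df: 8)")
print(f"sample variance = {var:.2f}  (theory for n - 2 = 8 df: 16)")
```

With $n = 10$, the simulated mean should land near $8$ rather than $10$, which is the numerical counterpart of losing two degrees of freedom.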
