Why are the Degrees of Freedom for multiple regression n - k - 1? For linear regression, why is it n - 2?

Question

I'm reading a textbook and I see this question:

So there are 200 women, and the DF is 196, implying that the equation for DF is $n - k - 1$. There are 3 variables: bp, age, and type, so $k = 3$. What's the intuition behind this?

Also, why are the degrees of freedom for simple linear regression $n - 2$?

If n = 200 and k = 3, then df = 196, which is n - k - 1, as you mention in your question, and not 4n - k - 1 as you mention inside your question. Get this straightened out and then we can consider the explanation. Use the self-study tag. — Michael R. Chernick, May 01 '17 at 21:03
Based on your edit, I would say that the example leading to n - 2 has only 2 parameters, but the textbook question has 4 parameters. — Michael R. Chernick, May 01 '17 at 21:06
Many, many threads address this question: see https://stats.stackexchange.com/search?q=n-k++regression. — whuber, Mar 09 '21 at 19:20

Matthew Gunn · Answer 1 · 2017-05-01T21:18:54.893

In linear regression, the degrees of freedom of the residuals is:

$$ \mathit{df} = n - k^*$$

where $k^*$ is the number of parameters you're estimating, INCLUDING an intercept. (The residual vector lies in an $(n - k^*)$-dimensional linear subspace.)

If you include an intercept term in a regression and $k$ refers to the number of regressors not including the intercept then $k^* = k + 1$.
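As a rough numerical check of the claim that the residuals live in an $(n - k^*)$-dimensional space, the sketch below builds the residual-maker matrix $M = I - X(X'X)^{-1}X'$ and looks at its rank and trace. The sizes `n = 50` and `k = 3`, the seed, and the simulated regressors are all arbitrary assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, for reproducibility only
n, k = 50, 3                    # hypothetical sizes: 50 observations, 3 regressors
k_star = k + 1                  # parameters estimated, including the intercept

# Design matrix: a column of ones (intercept) plus k simulated regressors.
X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])

# Residual-maker matrix M = I - X (X'X)^{-1} X'. For full-column-rank X,
# pinv(X) equals (X'X)^{-1} X', so X @ pinv(X) is the hat matrix.
M = np.eye(n) - X @ np.linalg.pinv(X)

# The residuals are e = M y for any y. M is a projection, so its rank
# equals its trace, and both should equal n - k_star.
df_resid = np.linalg.matrix_rank(M, tol=1e-8)
trace_M = np.trace(M)
print(df_resid, round(float(trace_M), 6), n - k_star)
```

The three printed numbers should agree, whatever response vector $y$ is later plugged in, because $M$ depends only on the design matrix.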

Notes:

How $k$ is defined varies across statistics texts: some include the intercept term in $k$ and some do not.
My notation of $k^*$ isn't standard.

Examples:

Simple linear regression:

In the simplest model of linear regression you are estimating two parameters:

$$ y_i = b_0 + b_1 x_i + \epsilon_i$$

People often refer to this as $k=1$. Hence we're estimating $k^* = k + 1 = 2$ parameters. The residual degrees of freedom is $n-2$.
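One way to see where the 2 comes from: the least-squares fit forces two linear constraints on the residuals (they sum to zero, and they are orthogonal to $x$), so only $n - 2$ of them can vary freely. The sketch below checks this on simulated data; the sample size, seed, and "true" coefficients are hypothetical values picked for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)  # arbitrary seed
n = 20                          # hypothetical sample size
x = rng.uniform(0, 10, size=n)
y = 2.0 + 0.5 * x + rng.normal(size=n)  # hypothetical b0 = 2.0, b1 = 0.5

# Fit y_i = b0 + b1 * x_i by ordinary least squares.
X = np.column_stack([np.ones(n), x])
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta_hat

# Two estimated parameters give two constraints on the residuals:
sum_resid = resid.sum()           # ~0, from estimating the intercept b0
sum_x_resid = (x * resid).sum()   # ~0, from estimating the slope b1
print(sum_resid, sum_x_resid, n - 2)
```

Both sums come out at zero up to floating-point error, leaving $n - 2 = 18$ residual degrees of freedom in this toy setup.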

Your textbook example:

You have 3 regressors (bp, type, age) and an intercept term. You're estimating 4 parameters and the residual degrees of freedom is $n - 4$.
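The same count can be sketched in code. The textbook data are not available here, so the columns below are simulated stand-ins, and treating `type` as a single 0/1 column is an assumption (it matches the count of 3 regressors above, but a categorical variable with more levels would add more columns).

```python
import numpy as np

rng = np.random.default_rng(2)  # arbitrary seed
n = 200                         # 200 women, as in the question

# Simulated stand-ins for the three regressors (NOT the textbook data).
bp = rng.normal(120, 15, size=n)
age = rng.uniform(20, 60, size=n)
type_ = rng.integers(0, 2, size=n)  # assumption: a single binary indicator

# Intercept column plus the 3 regressors gives 4 estimated parameters.
X = np.column_stack([np.ones(n), bp, age, type_])
k_star = np.linalg.matrix_rank(X)   # number of linearly independent columns
df_resid = n - k_star
print(k_star, df_resid)
```

With four linearly independent columns this gives $200 - 4 = 196$, matching the DF quoted in the question.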

@Jwan622 [The intercept](http://blog.minitab.com/blog/adventures-in-statistics-2/regression-analysis-how-to-interpret-the-constant-y-intercept) is another term you have to estimate. — Matthew Gunn, May 02 '17 at 15:45

score 1 · Answer 2 · edited Mar 09 '21 at 20:42

If I have $n$ observations, the data could have gone $n$ ways. But say I am estimating 3 parameters (including the intercept); then really it could only have gone $(n-3)$ ways, as I already have estimates of 3 things which control the data. That's my way of looking at it.
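A small experiment in the spirit of this intuition, assuming simulated data and a hypothetical helper `residuals`: when 3 parameters are fitted to only 3 observations, the fit passes through every point and the residuals have no freedom left ($3 - 3 = 0$); with 10 observations, $10 - 3 = 7$ degrees of freedom remain and the residuals are no longer forced to zero.

```python
import numpy as np

rng = np.random.default_rng(3)  # arbitrary seed

def residuals(n, p=3):
    """Least-squares residuals from fitting p parameters (intercept
    included) to n simulated observations. Hypothetical helper."""
    X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
    y = rng.normal(size=n)
    beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
    return y - X @ beta_hat

r3 = residuals(3)    # n = p: zero residual degrees of freedom
r10 = residuals(10)  # n > p: 10 - 3 = 7 residual degrees of freedom

print(np.allclose(r3, 0.0, atol=1e-8))   # residuals pinned to zero
print(np.allclose(r10, 0.0, atol=1e-8))  # residuals free to vary
```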

answered Mar 09 '21 at 19:18 by JONATHAN · edited Mar 09 '21 at 20:42 by MarianD
