2

I am mostly having trouble with this question because I don't know where to start, and I am not confident in the answer I currently have. That answer is mostly intuitive, and I would have to write an essay to explain it. I am looking for something that is more mathematically based.


The question is:

Prove that the estimated regression coefficients for a linear regression with an intercept term are identical to those obtained for the same linear regression equation without an intercept term, but for which all variables are replaced by deviation from their mean values.
NarphXCIX
  • 23
  • 5
  • 2
    self-study tag maybe? – Xi'an Sep 06 '15 at 08:28
  • 2
    Please add the `self-study` tag, and read its [tag-wiki](http://stats.stackexchange.com/tags/self-study/info), modifying your question as explained there (including the answer you're not confident in, and explaining where you think you need help with it). Please also use a more informative title. Every question here is a "statistics question". – Glen_b Sep 06 '15 at 10:15
  • 1
    The result follows straightforwardly from the observations that $$\operatorname{cov}(X-a,Y-b) = \operatorname{cov}(X,Y)$$ and $$\operatorname{var}(X-a) = \operatorname{var}(X),$$ that is, _translation_ of the variables in general (centering is a specific case) does not change variances and covariances, and so has no effect on the regression coefficients. What _does_ change is the _intercept_ which becomes $0$ if all the variables are translated to have zero mean. – Dilip Sarwate Sep 06 '15 at 14:12
  • This is much more straightforward. – Deep North Sep 07 '15 at 00:21

2 Answers

5

all variables are replaced by deviation from their mean values

I think the question is about centering the independent variables. To make our life easier, we will just use a linear regression with one predictor variable and an intercept.

Suppose the simple linear regression is:

$Y_i=\beta_0 +\beta_1x_i+e_i$

$e_i\sim N(0,\sigma^2)$

Treating $\beta_0 +\beta_1x_i$ as a constant, it is not difficult to see:

$Y_i\sim N(\beta_0 +\beta_1x_i,\sigma^2)$ (not needed for this proof)

We can write the model in matrix form:

$Y=X\beta+e$

$X$ is the design matrix with two columns: $\begin{bmatrix} 1 & x_1\\ 1 & x_2 \\ ...& ...\\ 1 & x_n \end{bmatrix}$

By OLS we get $\beta=(X'X)^{-1}X'Y$

Here $\beta=(\beta_0,\beta_1)'$
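If it helps to see this formula in action, here is a minimal numpy sketch (the data are simulated and purely illustrative, not part of the question):

```python
import numpy as np

# Hypothetical data, purely for illustration
rng = np.random.default_rng(0)
x = rng.normal(size=50)
y = 2.0 + 3.0 * x + rng.normal(scale=0.5, size=50)

# Design matrix with a column of ones for the intercept
X = np.column_stack([np.ones_like(x), x])

# OLS estimate: beta = (X'X)^{-1} X'Y (solved without forming the inverse explicitly)
beta = np.linalg.solve(X.T @ X, X.T @ y)
print(beta)  # approximately [2.0, 3.0]
```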

Below is the proof:

Next we will see what happens when we center the $x_i$:

Let $x_i^*=x_i-\bar{x}$ (here the $x_i^*$s are the deviations from the mean value)

$\bar{x}=\frac{\sum x_i}{n}$

$Y_i=\beta_0^* + \beta_1^*x_i^*+ e_i$

Now the design matrix $X^*$ is :

$\begin{bmatrix} 1 & x_1-\bar{x}\\ 1 & x_2-\bar{x} \\ ...& ...\\ 1 & x_n-\bar{x} \end{bmatrix}$

Still by OLS we can calculate

$\beta ^*=(X^{*'}X^*)^{-1}X^{*'}Y$

Now let us see what happens with $\beta^*$:

$X^{*'}X^*=\begin{bmatrix} 1 & 1&... &1 \\ x_1-\bar{x} &x_2-\bar{x}&... &x_n-\bar{x} \end{bmatrix}*\begin{bmatrix} 1 & x_1-\bar{x}\\ 1 & x_2-\bar{x} \\ ...& ...\\ 1 & x_n-\bar{x} \end{bmatrix}=\begin{bmatrix} n & \sum (x_i-\bar{x})\\ \sum (x_i-\bar{x}) & \sum (x_i-\bar{x})^2 \end{bmatrix}$

Note that $\sum (x_i-\bar{x})=0$, so

$X^{*'}X^*=\begin{bmatrix} n & 0\\ 0 & \sum (x_i-\bar{x})^2 \end{bmatrix}$

$\therefore (X^{*'}X^*)^{-1}=\begin{bmatrix} 1/n & 0\\ 0 & 1/\sum (x_i-\bar{x})^2 \end{bmatrix}$
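As a quick numerical sanity check that the off-diagonal entries of $X^{*'}X^*$ really vanish after centering, here is a small numpy sketch with simulated data (purely illustrative):

```python
import numpy as np

# Hypothetical predictor, purely for illustration
rng = np.random.default_rng(1)
x = rng.normal(loc=5.0, size=30)

x_star = x - x.mean()                              # centered predictor
X_star = np.column_stack([np.ones_like(x_star), x_star])

# X*'X* is diagonal because sum(x_i - xbar) = 0 (up to floating-point error)
print(X_star.T @ X_star)
```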

$\beta^*=(X^{*'}X^*)^{-1}X^{*'}Y=\begin{bmatrix} 1/n & 0\\ 0 & 1/\sum (x_i-\bar{x})^2 \end{bmatrix}*\begin{bmatrix} 1 & 1&... &1 \\ x_1-\bar{x} &x_2-\bar{x}&... &x_n-\bar{x} \end{bmatrix}*\begin{bmatrix} y_1\\ y_2\\ ...\\ y_n\end{bmatrix} =\begin{bmatrix} \frac{\sum y_i}{n} \\ \frac{\sum (x_i-\bar{x})y_i}{\sum (x_i-\bar{x})^2} \end{bmatrix}=\begin{bmatrix} \beta_0^*\\ \beta_1^* \end{bmatrix}$

Here we can see that $\beta_0^*$ (the intercept) is $\bar{y}$, which is just the grand mean.

Next we need to show $\beta_1^*=\beta_1$. Note that $\beta_1$ is the coefficient without centering, and we know that for the original $x_i$s, $\beta_1=\frac{\sum(x_i-\bar{x})(y_i-\bar{y})}{\sum(x_i-\bar{x})^2}$, i.e. $\frac{S_{xy}}{S_{xx}}$.

Here is a good reference on how to calculate the $\beta$s for linear regression.

On page 4 you can see $b_1$ (here $\beta_1$) $=\frac{\sum(x_i-\bar{x})(y_i-\bar{y})}{\sum(x_i-\bar{x})^2}$

and $b_0$ (here $\beta_0$) $=\bar{y}-b_1\bar{x}$.

Comparing with $\beta_1^*=\frac{\sum (x_i-\bar{x})y_i}{\sum (x_i-\bar{x})^2}$

We need to show that $\sum(x_i-\bar{x})y_i=\sum(x_i-\bar{x})(y_i-\bar{y})$

This is not difficult, since $\sum(x_i-\bar{x})(y_i-\bar{y})=\sum (x_i-\bar{x})y_i-\bar{y}\sum (x_i-\bar{x})$, and the last term is just zero.

$\therefore$ $\beta_1^*=\beta_1$.

In summary, by the above calculations we can see that when centering the $x_i$s only the intercept changes (it becomes the grand mean $\bar{y}$); the other coefficients do not change.
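As a quick numerical check of this summary, here is a minimal numpy sketch that fits both versions on simulated data (the data and numbers are assumptions for illustration only):

```python
import numpy as np

# Simulated data, purely for illustration
rng = np.random.default_rng(2)
x = rng.normal(loc=10.0, size=100)
y = 1.5 + 0.8 * x + rng.normal(scale=0.3, size=100)

# Fit with the original x
X = np.column_stack([np.ones_like(x), x])
b = np.linalg.solve(X.T @ X, X.T @ y)

# Fit with the centered x
Xc = np.column_stack([np.ones_like(x), x - x.mean()])
bc = np.linalg.solve(Xc.T @ Xc, Xc.T @ y)

print(b)   # [beta_0, beta_1]
print(bc)  # [ybar, beta_1]: slope unchanged, intercept is the grand mean
print(np.isclose(bc[0], y.mean()), np.isclose(b[1], bc[1]))  # True True
```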

##############################

Based on Dilip's comment, if you do not want to use matrices and do not consider the intercept, the following formula is much more straightforward:

$\beta_1=\frac{\sum(x_i-\bar{x})(y_i-\bar{y})}{\sum(x_i-\bar{x})^2}=\frac{Cov(x,y)}{Var(x)}=\frac{Cov(X-a,Y-b)}{Var(X-a)}$
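For completeness, a short numpy sketch of this translation-invariance argument, using simulated data and arbitrary shifts $a$ and $b$ (all purely illustrative):

```python
import numpy as np

# Simulated data and arbitrary shifts a, b, purely for illustration
rng = np.random.default_rng(3)
x = rng.normal(size=200)
y = 0.5 * x + rng.normal(size=200)
a, b = 7.0, -3.0

slope_original = np.cov(x, y)[0, 1] / np.var(x, ddof=1)
slope_shifted = np.cov(x - a, y - b)[0, 1] / np.var(x - a, ddof=1)

# Translation changes neither the covariance nor the variance, so the slope is unchanged
print(np.isclose(slope_original, slope_shifted))  # True
```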

Deep North
  • 4,527
  • 2
  • 18
  • 38
  • 1
    I enjoyed going through your derivation. Very nice. I wonder if you can consider a hyperlink for the $\beta_1$ formula when you say, '... and we know that for original $x_i$s,...' – Antoni Parellada Sep 06 '15 at 14:46
  • I am having trouble understanding how you obtained $\beta=(X'X)^{-1}X'Y$ by OLS. I have never used this design matrix format, so that is most likely where I am lost. – NarphXCIX Sep 06 '15 at 19:39
3

The regression equation $y = \beta_0 +\beta_1 x + \epsilon$ is estimated by minimizing the sum of squared residuals $\sum_i (y_i - \beta_0 - \beta_1 x_i)^2$.

When you differentiate this sum with respect to $\beta_0$ and set the derivative equal to zero (to find the minimum), you find that $\bar{y} = \beta_0 + \beta_1 \bar{x}$ (see below).

Subtracting the latter equation from the regression equation you find $(y-\bar{y}) = \beta_1 (x-\bar{x}) +\epsilon$.

This is the regression equation without an intercept but with the variables as deviations from the mean.

As you can see, $\beta_1$ is the same as in the 'original' equation.

Note that,

(1.1) For the 'original equation' we found that $\bar{y} = \beta_0 + \beta_1 \bar{x}$. Or that $\beta_0 = \bar{y}-\beta_1 \bar{x}$.

(1.2) For the equation in deviation form: $(y-\bar{y}) = \beta_1 (x-\bar{x})$ can be re-written as $y = (\bar{y} - \beta_1 \bar{x}) + \beta_1 x$, which has the same intercept. So if you estimate $\beta_1$ from the equation $(y-\bar{y}) = \beta_1 (x-\bar{x})+\epsilon$, then this is the same $\beta_1$ as estimated from the 'original' equation. If you want to estimate $\beta_0$ from $(y-\bar{y}) = \beta_1 (x-\bar{x})+\epsilon$, this can be done as $\beta_0 = \bar{y}-\beta_1 \bar{x}$.
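If you want to check (1.2) numerically, here is a minimal numpy sketch on simulated data (the numbers are assumptions for illustration only): it fits the no-intercept regression on the deviations and then recovers $\beta_0$ from the means.

```python
import numpy as np

# Simulated data, purely for illustration
rng = np.random.default_rng(4)
x = rng.normal(loc=3.0, size=80)
y = -1.0 + 2.5 * x + rng.normal(scale=0.4, size=80)

# No-intercept regression of (y - ybar) on (x - xbar)
xd, yd = x - x.mean(), y - y.mean()
beta1 = (xd @ yd) / (xd @ xd)

# Recover the intercept of the original equation from the means
beta0 = y.mean() - beta1 * x.mean()

# Compare with the ordinary fit that includes an intercept
X = np.column_stack([np.ones_like(x), x])
print(np.linalg.solve(X.T @ X, X.T @ y))  # [beta_0, beta_1]
print(beta0, beta1)                        # the same values
```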

=====

Proof that minimising $\sum_i (y_i - \beta_0 - \beta_1 x_i) ^ 2$ yields $\bar{y} = \beta_0 + \beta_1 \bar{x}$.

Differentiating $\sum_i (y_i-\beta_0 -\beta_1 x_i)^2$ with respect to $\beta_0$ and setting the derivative to zero gives $2 \sum_i (y_i -\beta_0 -\beta_1 x_i) (-1) = 0$. It follows that $\sum_i y_i -n \beta_0 -\beta_1\sum_i x_i =0$, and after dividing by $n$ you find $\bar{y} = \beta_0 +\beta_1 \bar{x}$.
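A short numerical check of this normal-equation result, on simulated data (purely illustrative):

```python
import numpy as np

# Simulated data, purely for illustration
rng = np.random.default_rng(5)
x = rng.normal(size=60)
y = 4.0 - 1.2 * x + rng.normal(scale=0.2, size=60)

X = np.column_stack([np.ones_like(x), x])
b0, b1 = np.linalg.solve(X.T @ X, X.T @ y)

# The normal equation for beta_0 says the fitted line passes through (xbar, ybar)
print(np.isclose(y.mean(), b0 + b1 * x.mean()))  # True
```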

mpiktas
  • 33,140
  • 5
  • 82
  • 138
  • 1
    This is a good solution. you can move $\bar{y}$ to right hand too, which is the new intercept. – Deep North Sep 07 '15 at 04:54
  • @Deep North: I did not do that because the question asked for 'deviations from the mean'. Note also that the intercept is $\bar{y} -\beta_1 \bar{x}$, not $\bar{y}$. –  Sep 07 '15 at 05:06
  • $\bar{y} -\beta_1 \bar{x}$ is for original $x_i$s, $\bar{y}$ is the intercept after centering. I mean this one $y-\bar{y} = \beta_1 (x-\bar{x}) +\epsilon$ – Deep North Sep 07 '15 at 05:07
  • @Deep North: after centering $x$ **and** $y$ there is no more intercept and the question was to show that '... without an intercept term' (see the question). –  Sep 07 '15 at 05:10
  • @Deep North: :-) I read your answer and you get a (+1) for it –  Sep 07 '15 at 05:34