
Suppose we have the model $Y = \beta_{0} + \beta_{1}X_{1} + \beta_{2}X_{2} + \epsilon$ and a least squares estimator $\hat{\beta}$ of its coefficients.

Now we substitute $\tilde{Y} = Y - \bar{Y}$ (subtracting the mean of $Y$) and $\tilde{X}_{1} = X_{1} - \bar{X}_{1}$ (subtracting the mean of $X_{1}$), and similarly for $X_{2}$, and the model becomes $\tilde{Y} = \beta_{1}\tilde{X}_{1} + \beta_{2}\tilde{X}_{2} + \tilde{\epsilon}$.

I know that this step will not affect the estimators of $\beta_{1}$ and $\beta_{2}$, but I cannot come up with a solid proof.
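For concreteness, here is a quick numerical check (a minimal sketch with simulated data and numpy; the seed and coefficients are arbitrary) that the slope estimates coincide:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
X = rng.normal(size=(n, 2))  # columns play the role of X1 and X2
y = 1.5 + 2.0 * X[:, 0] - 3.0 * X[:, 1] + rng.normal(size=n)

# Original model: regress y on (1, X1, X2)
A = np.column_stack([np.ones(n), X])
beta_full = np.linalg.lstsq(A, y, rcond=None)[0]

# Centered model: regress (y - ybar) on (X - Xbar), with no intercept
Xc = X - X.mean(axis=0)
yc = y - y.mean()
beta_centered = np.linalg.lstsq(Xc, yc, rcond=None)[0]

print(beta_full[1:])   # slope estimates from the model with an intercept
print(beta_centered)   # identical slope estimates from the centered model
```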

  • https://stats.stackexchange.com/questions/507163/why-does-applying-a-linear-transformation-to-a-covariate-change-regression-coeff/507178#507178 answers a generalization of this question: when you modify your variables by means of a linear transformation, the model is the same. Subtracting the mean *in a model that has an intercept* is a linear transformation, *QED.* – whuber Feb 03 '22 at 20:08

1 Answer


(using a bit more general notation)

For $Y$ an $n\times 1$ response and $X$ an $n \times p$ matrix whose columns are the regressors (keeping the intercept separate), write the model as $Y = \beta_{0}1_n + X\beta + \epsilon$, where $1_n$ is the $n \times 1$ vector of ones. We have the centering matrix $C_n = I_n - \tfrac{1}{n}J_n$, where $I_n$ is the $n\times n$ identity matrix and $J_n$ is the $n \times n$ matrix of ones; your centered matrices can then be computed as $\tilde{Y}=C_nY$ and $\tilde{X}=C_nX$. The normal equations for the least squares regression of $Y$ on $(1_n, X)$ are

$$ 1_n^\intercal(Y - \hat{\beta}_0 1_n - X\hat{\beta}) = 0, \qquad X^\intercal(Y - \hat{\beta}_0 1_n - X\hat{\beta}) = 0. $$

The first equation gives $\hat{\beta}_0 = \bar{Y} - \bar{X}^\intercal\hat{\beta}$, where $\bar{X}$ is the $p \times 1$ vector of column means of $X$. Substituting this into the second equation and noting that $C_nY = Y - \bar{Y}1_n$ and $C_nX = X - 1_n\bar{X}^\intercal$ yields

$$ \hat{\beta} = (X^\intercal C_n X)^{-1}X^\intercal C_n Y, $$

while the least squares solution of regressing $\tilde{Y}$ on $\tilde{X}$ (with no intercept) is

$$ \begin{align*} \hat{\tilde{\beta}} &= (\tilde{X}^\intercal \tilde{X})^{-1}\tilde{X}^\intercal \tilde{Y}\\ &=((C_nX)^\intercal C_nX)^{-1}(C_nX)^\intercal (C_nY)\\ &=(X^\intercal C_n^\intercal C_n X)^{-1}X^\intercal C_n^\intercal C_nY\\ &= (X^\intercal C_n X)^{-1}X^\intercal C_nY\\ &= \hat{\beta}, \end{align*} $$ where the last two steps use $C_n^\intercal = C_n$ and $C_nC_n = C_n$, i.e. that $C_n$ is symmetric and idempotent (you should be able to show this).

QED as they say.
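The identities used above can also be checked numerically; here is a minimal sketch (toy data of my own) verifying that $C_n$ is symmetric and idempotent and that the closed form matches the slope block of the fit with an intercept:

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 50, 2
X = rng.normal(size=(n, p))
Y = rng.normal(size=n)

# Centering matrix C_n = I_n - (1/n) J_n
C = np.eye(n) - np.ones((n, n)) / n

assert np.allclose(C, C.T)    # symmetric: C^T = C
assert np.allclose(C @ C, C)  # idempotent: C C = C (not I_n)

# Closed form (X^T C X)^{-1} X^T C Y ...
beta_closed = np.linalg.solve(X.T @ C @ X, X.T @ C @ Y)

# ... equals the slope block of the least squares fit of Y on (1_n, X)
A = np.column_stack([np.ones(n), X])
beta_hat = np.linalg.lstsq(A, Y, rcond=None)[0]
assert np.allclose(beta_closed, beta_hat[1:])
```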

bdeonovic
  • Most of this algebra is unnecessary: the crux of the matter is that an invertible linear transformation of the variables does not change a model that is linear in those variables. Thus, after you show how the centered and uncentered variables are related, *you are done.* It doesn't even matter that there is an explicit formula for the solution or that $C_n$ is idempotent. – whuber Feb 03 '22 at 20:10 (see the numerical sketch after this thread)
  • but sometimes algebra is fun – bdeonovic Feb 03 '22 at 20:16
  • Chacun à son goût (to each his own). Although algebra can be fun, I adhere even more to the [Principle of Mathematical Laziness](https://stats.stackexchange.com/a/32116/919). Following it tends to reveal the essence of an argument. – whuber Feb 03 '22 at 20:37
  • But what about my INTERNET points @whuber!?!? – bdeonovic Feb 03 '22 at 21:43
  • @whuber Also what if we were not interested in the least squares solution? What if we wanted to minimize the sum of absolute deviations? Does it still hold? – bdeonovic Feb 03 '22 at 21:53
  • Yes: exactly the same argument applies. The model depends on the subspace defined by the column vectors but not on any particular basis you might choose for it. – whuber Feb 03 '22 at 22:23
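To illustrate the comments numerically: if the design matrix $A$ (intercept column included) is replaced by $AT$ for any invertible $T$, the fitted values, and hence the model, are unchanged, and the coefficients transform as $T^{-1}\hat{\beta}$. Centering in a model with an intercept is one such $T$. A minimal sketch, with toy data of my own:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 100
X = rng.normal(size=(n, 2))
Y = rng.normal(size=n)

A = np.column_stack([np.ones(n), X])  # design with intercept column

# Invertible T chosen so that A @ T keeps the column of ones
# and centers the X-columns
T = np.array([[1.0, -X[:, 0].mean(), -X[:, 1].mean()],
              [0.0,  1.0,            0.0],
              [0.0,  0.0,            1.0]])

beta = np.linalg.lstsq(A, Y, rcond=None)[0]
gamma = np.linalg.lstsq(A @ T, Y, rcond=None)[0]

assert np.allclose(A @ beta, (A @ T) @ gamma)        # identical fitted values
assert np.allclose(gamma, np.linalg.solve(T, beta))  # gamma = T^{-1} beta
assert np.allclose(gamma[1:], beta[1:])              # slopes unchanged
```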