How did they simplify normal equations for OLS in linear regression?

Question

How did they go from (1) to (2)? \begin{align*} S_{xx} &= \sum(X_i - \bar{X})^2 \tag1 \\ &= \sum(X_i - \bar{X}) X_i \tag2 \\ &= \sum X_i^2 - \left(\sum{X_i}\right)^2/n \\ &= \sum X_i^2 - n \bar{X}^2 \end{align*} In (2), are they simply saying that $(X_i - \bar{X}) = X_i$? Why is that so?

It is also seen here in OLS equation: $$b_1 = \frac{\sum X_i Y_i - \left[\left(\sum X_i \right) \left(\sum Y_i \right)\right]/n}{\sum X_i^2 - \left( \sum X_i\right)^2 /n} = \frac{ \sum\left(X_i -\bar{X}\right) \left(Y_i - \bar{Y}\right)}{\sum \left(X_i - \bar{X} \right)^2}$$

The same step appears to be used again in the denominator, going between the two forms of $b_1$. Why does it work?
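A quick numerical check can make the two forms of $b_1$ above less mysterious. The following is only a sketch: the function names and the `ys` values are hypothetical, chosen for illustration, and it checks one dataset rather than proving the identity.

```python
import math

def slope_raw_sums(xs, ys):
    """Slope b1 from the raw-sum form (left-hand side of the equation)."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)
    return (sum_xy - sum_x * sum_y / n) / (sum_xx - sum_x ** 2 / n)

def slope_centered(xs, ys):
    """Slope b1 from the centered form (right-hand side of the equation)."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    return s_xy / s_xx

# Hypothetical data, chosen only for illustration.
xs = [1, 2, 4, 7, 11]
ys = [2.0, 3.5, 4.0, 8.5, 12.0]

b1_raw = slope_raw_sums(xs, ys)
b1_centered = slope_centered(xs, ys)
print(b1_raw, b1_centered)
print(math.isclose(b1_raw, b1_centered))
```

Both functions should agree up to floating-point rounding, since numerator and denominator are each the same quantity written two ways.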

Essentially it's just expanding the quadratic, simplifying, collecting & cancelling terms. See here for example: https://stats.stackexchange.com/questions/256179/method-of-moments-applied-to-a-normal-distribution -- the answer covers essentially this in the middle (apart from some minor simplifications to match the derivation here). [I'm debating whether this is different enough in detail to stand on its own rather than close as a duplicate.] — Glen_b, May 08 '17 at 15:01
As everyone is implying and saying, it's a simple algebraic consequence of the definition of $\bar{X} = \frac{1}{n} \sum_{i=1}^n X_i$ — Matthew Gunn, May 08 '17 at 15:02
@Glen Because this exact question (a) has appeared several dozen times but (b) is almost impossible to search for, I think it's well worth while closing the duplicates and pointing them at some good, canonical answers: we can hope that will help in future searches. — whuber, May 08 '17 at 15:07
I just changed the title there a few minutes ago in the hope of making it easier to find. — Glen_b, May 08 '17 at 15:12
No No No, Scott's answer is MUCH better! Simple and elegant! The other question does not answer my question; although similar, it never showed the identity at the end, which is the most important thing! — user13985, May 08 '17 at 15:44
@MatthewGunn What is the intuition behind $$ \sum(X_i - \bar{X})^2 = \sum(X_i - \bar{X}) X_i $$ that we can just see $(X_i - \bar{X})$ as $X_i$? — user13985, May 08 '17 at 17:51
@user13985 No because that is wrong. $a - b$ is not the same as $a$. — Matthew Gunn, May 08 '17 at 18:08
@MatthewGunn The math works out to say that $(X_i - \bar{X})$ is replaced with $X_i$; how am I supposed to think of this meaningfully? — user13985, May 08 '17 at 18:33
[This question and answer gives a geometric interpretation](https://stats.stackexchange.com/questions/254357/intuition-geometric-or-other-of-varx-ex2-ex2). — Matthew Gunn, May 08 '17 at 19:43
I am not sure why you keep on relating this to sample variance. My question is actually not related to variance. Say you have five numbers: [1, 2, 4, 7, 11]. The mean would be 5. The residual = x - xbar would be the vector [-4, -3, -1, 2, 6]. We take two cases to compare. Case 1: (x - xbar) * x = [-4, -6, -4, 14, 66]. Case 2: (x - xbar) * (x - xbar) = [16, 9, 1, 4, 36]. Now, sum of [-4, -6, -4, 14, 66] = **66**. And sum of [16, 9, 1, 4, 36] = **66**. I just want to know why in the world they are the same, not mathematically. I am going towards the direction of centering and scaling. — user13985, May 08 '17 at 19:54
@MatthewGunn I think my intuition for this $$ \sum(X_i - \bar{X})(X_i - \bar{X}) = \sum(X_i - \bar{X}) X_i $$ is that replacing $(X_i - \bar{X})$ with $X_i$ is simply centering the data $X_i$. Now, I am wondering why centering doesn't change the result. — user13985, May 08 '17 at 20:04
But if centering the data doesn't affect its outcome, then would this be true? $$\sum X_i (X_i - \bar{X}) \tag1$$ be $$\sum X_i X_i \tag2$$ because looking at the centering on the second term of (1), this is also centering. I don't think so! So, I suppose centering only works in certain conditions! But, when? — user13985, May 08 '17 at 20:09

Him · Accepted Answer · 2017-05-08T15:01:57.837

4

\begin{align*} \sum (X_i - \bar{X})^2 &= \sum(X_i - \bar{X})(X_i - \bar{X}) \\ &= \sum (X_i^2 - 2\bar{X}X_i + \bar{X}^2) \\ &= \sum \left[(X_i^2 - \bar{X}X_i) + (\bar{X}^2 - \bar{X}X_i)\right] \\ &= \sum \left[(X_i - \bar{X})X_i + (\bar{X} - X_i)\bar{X}\right] \end{align*}

We can "distribute" the $\Sigma$ over those two summands. The second one turns out to be zero:

\begin{align*} \sum (\bar{X} - X_i)\bar{X} &= \bar{X} \sum (\bar{X} - X_i) \\ &= \bar{X} \left(\sum \bar{X} - \sum X_i\right) \\ &= \bar{X} (n\bar{X} - n\bar{X}) \\ &= 0 \end{align*}
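As a sanity check, the derivation can be tried on the five numbers from the comments, [1, 2, 4, 7, 11]. This is a minimal sketch, not part of the original answer; the function name `sxx_forms` is made up for illustration.

```python
def sxx_forms(xs):
    """Return several algebraically equal forms of S_xx, plus the vanishing term."""
    n = len(xs)
    x_bar = sum(xs) / n
    squared_devs = sum((x - x_bar) ** 2 for x in xs)       # sum (X_i - X_bar)^2
    dev_times_x = sum((x - x_bar) * x for x in xs)         # sum (X_i - X_bar) X_i
    raw_sums = sum(x * x for x in xs) - sum(xs) ** 2 / n   # sum X_i^2 - (sum X_i)^2 / n
    raw_mean = sum(x * x for x in xs) - n * x_bar ** 2     # sum X_i^2 - n X_bar^2
    # The leftover term from the derivation: sum (X_bar - X_i) X_bar
    cross_term = sum((x_bar - x) * x_bar for x in xs)
    return squared_devs, dev_times_x, raw_sums, raw_mean, cross_term

forms = sxx_forms([1, 2, 4, 7, 11])
print(forms)
```

For this data the first four values are all 66 and the leftover term is 0, matching the sums quoted in the comments.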

edited May 08 '17 at 15:01

answered May 08 '17 at 14:51


Is this correct? $$\sum X_i^2 - 2 \sum X_i \bar{X} + \sum \bar{X}^2 = \sum \left[X_i^2 - 2 X_i \bar{X} + \bar{X}^2 \right]$$ I think that's the part that got me stuck! – user13985 May 08 '17 at 16:29
Between steps 2 and 3 is just the distributive property (FOIL, recall from algebra). Between steps 3 and 4: take the two $\bar{X}X_i$ terms that you have, and give one to the first group and one to the second group. The parentheses are just there to accentuate the grouping. – Him May 08 '17 at 19:53
.... The equality in your comment is correct, yes, but you shouldn't need that specific result in the proof. – Him May 08 '17 at 19:54
Can you explain why $$ \sum(X_i - \bar{X})(X_i - \bar{X}) = \sum(X_i - \bar{X}) X_i $$ What is the intuition behind replacing $(X_i - \bar{X})$ with $X_i$ but still have the same result? – user13985 May 08 '17 at 20:01

Matthew Gunn · Answer 2 · 2017-05-08T18:52:59.730

1

Simple numerical example

Let $X_1 = 1$, $X_2 = 3$, $X_3 = 8$

Then $\bar{X} = \frac{1}{3} \left( 1 + 3 + 8\right) = 4$.

It is not at all correct to say $(X_1 - \bar{X}) = X_1$, which would be equivalent to saying that $(1 - 4) = 1$.

The point is that

$$ \sum_i \bar{X} \left( X_i - \bar{X} \right) = 0$$

because $\sum_i X_i = n \bar{X}$. In this example $1 + 3 + 8 = 3 \cdot 4 = 12$

In this example, the statement $\sum_i \bar{X} \left( X_i - \bar{X} \right) = 0$ would be:

$$4\left( 1 - 4 \right) + 4 \left( 3 - 4 \right) + 4 \left(8 - 4\right) = 0$$

If you factor out $\bar{X}$:

$$ \bar{X} \left[ \sum_i \left( X_i - \bar{X} \right) \right] = 4 \left[ ( 1 - 4) + (3 - 4) + (8 - 4) \right] = 0$$
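The same example can be checked in a few lines. This is an illustrative sketch only; `shifted_sum` is a hypothetical helper, not something from the question. It relies on the fact that the deviations $X_i - \bar{X}$ sum to zero, so subtracting any constant $c$ from the other factor adds $-c \sum_i (X_i - \bar{X}) = 0$ to the total.

```python
xs = [1, 3, 8]
n = len(xs)
x_bar = sum(xs) / n  # 4.0

# Deviations from the mean: [-3.0, -1.0, 4.0]
deviations = [x - x_bar for x in xs]

def shifted_sum(c):
    """Sum of (X_i - X_bar) * (X_i - c) for an arbitrary constant c."""
    return sum(d * (x - c) for d, x in zip(deviations, xs))

print(sum(deviations))         # deviations from the mean sum to zero
print(shifted_sum(x_bar))      # c = X_bar: sum of squared deviations
print(shifted_sum(0))          # c = 0: sum of (X_i - X_bar) * X_i
print(shifted_sum(100))        # any other constant gives the same total
print(sum(x * x for x in xs))  # dropping X_bar from BOTH factors does change the sum
```

The first three `shifted_sum` calls agree (26 here), while $\sum X_i^2$ is 74. At least one factor has to stay centered for the shift in the other factor to cancel.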

edited May 08 '17 at 18:52

answered May 08 '17 at 14:56


Yes, this makes sense. $(X_i - \bar{X})$ is essentially residuals around the mean, which sums to zero, and $\bar{X}$ is a constant. But, I was asking **why** this $$\sum (X_i - \bar{X})^2 $$ results in this $$\sum (X_i - \bar{X})X_i $$ **in its interpretation**. I have understood how the math works out. – user13985 May 08 '17 at 19:24
@user13985 This is the ubiquitous identity that $\operatorname{Var}(X) = \operatorname{E}[X^2] - \operatorname{E}[X]^2$. [See this question](https://stats.stackexchange.com/questions/254357/intuition-geometric-or-other-of-varx-ex2-ex2) – Matthew Gunn May 08 '17 at 19:35
I even wrote a short R code: … – user13985 May 08 '17 at 19:44
