
I have a question about the general linear model:

$$y_i=\beta_0+\beta_1x_{i1}+\beta_2x_{i2}+\dots+\beta_{p}x_{ip}+\epsilon_i$$

and this relationship, $SST=SSR+SSE$:

$$\sum_{i=1}^{n}(y_i-\bar{y})^2=\sum_{i=1}^{n}(\hat{y_i}-\bar{y})^2+\sum_{i=1}^{n}(y_i-\hat{y_i})^2$$
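For reference, this identity comes from adding and subtracting $\hat{y_i}$ inside the square; the cross term is what makes or breaks it:

$$\sum_{i=1}^{n}(y_i-\bar{y})^2=\sum_{i=1}^{n}\big[(y_i-\hat{y_i})+(\hat{y_i}-\bar{y})\big]^2=SSE+SSR+2\sum_{i=1}^{n}(y_i-\hat{y_i})(\hat{y_i}-\bar{y})$$

When the model includes an intercept, the OLS residuals $e_i=y_i-\hat{y_i}$ satisfy $\sum_i e_i=0$ and $\sum_i e_i\hat{y_i}=0$, so the cross term vanishes and $SST=SSR+SSE$ holds in this centered form.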

My understanding is that if the linear model contains $\beta_0$ (the intercept is not constrained to zero), then the above decomposition of the total sum of squares holds. What I can't understand is why, when the $\beta_0$ coefficient disappears (i.e. the intercept is fixed at zero), the decomposition becomes:

$$\sum_{i=1}^{n}y_i^2=\sum_{i=1}^{n}\hat{y_i}^2+\sum_{i=1}^{n}(y_i-\hat{y_i})^2$$

To my knowledge, $\bar{y}$ is the mean of the response values. Why does this term disappear when the intercept is zero?
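The two identities in the question can be checked numerically. Below is a minimal sketch using only NumPy: it fits the same simulated data once with an intercept column and once without, then compares the centered and uncentered sums of squares. The data-generating coefficients and the seed are arbitrary choices for illustration.

```python
import numpy as np

# Simulated data with a genuinely nonzero intercept (arbitrary example values).
rng = np.random.default_rng(0)
n = 50
x = rng.normal(size=n)
y = 2.0 + 3.0 * x + rng.normal(size=n)

# --- Model WITH an intercept: design matrix has a column of ones ---
X1 = np.column_stack([np.ones(n), x])
beta1, *_ = np.linalg.lstsq(X1, y, rcond=None)
yhat1 = X1 @ beta1
sst = np.sum((y - y.mean()) ** 2)          # centered total SS
ssr = np.sum((yhat1 - y.mean()) ** 2)      # centered regression SS
sse = np.sum((y - yhat1) ** 2)             # residual SS
print(np.isclose(sst, ssr + sse))          # centered identity holds

# --- Model WITHOUT an intercept: regression through the origin ---
X0 = x.reshape(-1, 1)
beta0, *_ = np.linalg.lstsq(X0, y, rcond=None)
yhat0 = X0 @ beta0
sst_u = np.sum(y ** 2)                     # UNcentered total SS
ssr_u = np.sum(yhat0 ** 2)                 # UNcentered regression SS
sse_u = np.sum((y - yhat0) ** 2)
print(np.isclose(sst_u, ssr_u + sse_u))    # uncentered identity holds

# The centered decomposition generally fails without an intercept,
# because the residuals no longer sum to zero:
sst_c = np.sum((y - y.mean()) ** 2)
ssr_c = np.sum((yhat0 - y.mean()) ** 2)
print(np.isclose(sst_c, ssr_c + sse_u))
```

The key design point: without the column of ones, least squares only guarantees that the residuals are orthogonal to $x$, not that they sum to zero, so only the uncentered decomposition survives.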

  • An example where $SST \neq SSR + SSE$, because of a missing intercept, is seen here https://stats.stackexchange.com/questions/251337/how-can-the-mse-of-predictions-be-greater-than-the-variance-of-the-response-vari/325395#325395 – Sextus Empiricus Oct 01 '21 at 14:57

0 Answers