I hope this is not a duplicate, but I cannot find an answer to this question. In a linear model with no intercept, $$Y_i = \beta_1 X_{i,1} + \dots + \beta_{p-1} X_{i,p-1} + \varepsilon_i, \qquad i = 1, \ldots, n$$ with the usual assumptions, is the regression sum of squares, $SSR$, still

$$ SSR = \sum_{i=1}^n (\hat{y}_i - \overline{y})^2 \text{ ?}$$ Here $\hat{y}_i = x_i^T \hat{\beta}$ is the $i$-th fitted value (the $i$-th entry of $X \hat{\beta}$, with $x_i^T$ the $i$-th row of $X$), $\hat{\beta} = (X^TX)^{-1}X^T y$, and $X$ is the design matrix without the column of ones that it would have if we could not assume $\beta_0 = 0$.

I'm asking because, using the `anova` function in R, you can obtain the $SSR$ by adding up the sums of squares reported for each variable (I believe this is called a type I, or sequential, decomposition), but the total doesn't match the $SSR$ as calculated above for a model with $\beta_0 = 0$.

Am I missing something or did I just screw up calculating it?

I had a sample of 2 variables, $X_1$ and $X_2$ with $n=11$ observations, as follows: $x_1 = (1,4,9,11,3,8,5,10,2,7,6)^T$, $x_2 = (8,2,-8,-10,6,-6,0,-12,4,-2,-4)^T$ and $y=(6,8,1,0,5,3,2,-4,10,-3,5)^T$.

I entered them in R as `y`, `x1` and `x2`. Then, using `anova(lm(y~0+x1+x2))`, I got a Sum Sq of 14.279 for `x1` and 161.875 for `x2`. Their sum is 176.154.
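For reference, the R session described above can be reproduced roughly as follows. This is only a sketch: the names `fit`, `tab` and `ssr_anova` are my own choices, and the printed table is not shown here.

```r
# Data from the question
x1 <- c(1, 4, 9, 11, 3, 8, 5, 10, 2, 7, 6)
x2 <- c(8, 2, -8, -10, 6, -6, 0, -12, 4, -2, -4)
y  <- c(6, 8, 1, 0, 5, 3, 2, -4, 10, -3, 5)

# Fit the model without an intercept ("0 +" drops the column of ones)
fit <- lm(y ~ 0 + x1 + x2)

# Sequential ANOVA table: one row per term, plus a Residuals row
tab <- anova(fit)
print(tab)

# Add up the "Sum Sq" entries of the two regressors
ssr_anova <- sum(tab[c("x1", "x2"), "Sum Sq"])
print(ssr_anova)
```

Because the table is sequential, the individual entries depend on the order in which `x1` and `x2` appear in the formula, although their total should not.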

However, using the design matrix with $x_1$ and $x_2$ as its columns, I got $\hat{\beta} = (\hat{\beta}_1, \hat{\beta}_2)^T = (0.7211, 0.8089)^T$ (which matches the estimates from R), so $SSR = 96.37352$ by the formula above, which is clearly different from the total obtained in R.
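The hand calculation can be sketched in R as below. The names `X`, `beta_hat`, `y_hat`, `ssr_centered` and `ssr_uncentered` are my own. The second quantity, the plain sum of squared fitted values with no centering at $\overline{y}$, is not part of the formula above; I include it only as a comparison, since on these data it appears to coincide with the total reported by `anova`.

```r
# Data from the question
x1 <- c(1, 4, 9, 11, 3, 8, 5, 10, 2, 7, 6)
x2 <- c(8, 2, -8, -10, 6, -6, 0, -12, 4, -2, -4)
y  <- c(6, 8, 1, 0, 5, 3, 2, -4, 10, -3, 5)

# Design matrix without a column of ones
X <- cbind(x1, x2)

# Least-squares estimate: solves (X'X) beta = X'y
beta_hat <- solve(crossprod(X), crossprod(X, y))

# Fitted values as a plain numeric vector
y_hat <- drop(X %*% beta_hat)

# SSR as defined in the question: fitted values centered at mean(y)
ssr_centered <- sum((y_hat - mean(y))^2)

# Comparison quantity (my addition): fitted values not centered
ssr_uncentered <- sum(y_hat^2)

print(beta_hat)
print(c(centered = ssr_centered, uncentered = ssr_uncentered))
```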

user45453
  • Can you share the R code & output that prompted this question? Did you fit `y~0+x1+...`? Is the intercept *exactly* 0 in both the population & the sample? – gung - Reinstate Monica Jan 03 '15 at 18:00
  • Yes, I used `y~0+x1+...`. I'm not sure exactly what you mean by the intercept being 0 in the population and sample. I have $n=11$ observations of each variable $X_j$ ($j=1,2$) and $n$ of $Y$. It was from a problem in my class and I arbitrarily decided to use a model with $\beta_0 = 0$. – user45453 Jan 03 '15 at 18:07
  • I'll put the code in a second. – user45453 Jan 03 '15 at 18:09
  • possible duplicate of [When forcing intercept of 0 in linear regression is acceptable/advisable](http://stats.stackexchange.com/questions/102709/when-forcing-intercept-of-0-in-linear-regression-is-acceptable-advisable) – Xi'an Jan 03 '15 at 19:49
  • Would you check my answer [here](https://stats.stackexchange.com/questions/234850/intuition-behind-regression-sum-of-squares/361086#361086)? I guess it would be helpful. The equation of SSR should be different. – KDG Aug 07 '18 at 11:49
