4

I ran a toy experiment with linear regression, but I am getting different results for $R^2$ from summary() and from computing it by hand. Could anyone help me?

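library(ggplot2)  # provides the diamonds data set
# "- 1" drops the intercept from the model
fit <- lm(price ~ carat + depth + table + x + y + z - 1, data = diamonds)
summary(fit)  # reports Multiple R-squared
# Manual R^2 from the textbook formula 1 - SSE/SST
sse <- crossprod(diamonds$price - fit$fitted.values)
sst <- crossprod(diamonds$price - mean(diamonds$price))
1 - sse / sst  # does not match the value from summary(fit)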


Haitao Du

1 Answer

5

I found the problem.

I did not include an intercept in the model, and without an intercept SST is calculated differently. It should be

crossprod(diamonds$price)

rather than

crossprod(diamonds$price-mean(diamonds$price))

I think most books are confusing on this point, because they just give the formula for SST as

$$ \|y-\bar y\|^2 $$

without mentioning that this formula only holds for a model with an intercept term.
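
As a quick check of the above, here is a minimal sketch that refits the same no-intercept model and computes $R^2$ both ways. The names r2_uncentered and r2_centered are just illustrative variables I made up for this comparison. As far as I can tell, summary() on a model without an intercept uses the total sum of squares about zero, so only the first version should agree with it.

```r
library(ggplot2)  # provides the diamonds data set

# No-intercept fit, as in the question
fit <- lm(price ~ carat + depth + table + x + y + z - 1, data = diamonds)

y   <- diamonds$price
sse <- sum(residuals(fit)^2)

# Illustrative names: R^2 with SST taken about zero vs. about the mean
r2_uncentered <- 1 - sse / sum(y^2)
r2_centered   <- 1 - sse / sum((y - mean(y))^2)

# Compare each manual value against what summary() reports
all.equal(summary(fit)$r.squared, r2_uncentered)
all.equal(summary(fit)$r.squared, r2_centered)
```

Since $\|y\|^2 = \|y-\bar y\|^2 + n\bar y^2$, the uncentered SST is never smaller than the centered one, which is why the no-intercept $R^2$ from summary() comes out larger than the hand-computed value in the question.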

Haitao Du
  • 2
    You're correct but I believe this effectively duplicates a number of questions already on site. – Glen_b Sep 04 '16 at 06:08