
Here is an example of fitting an ANCOVA model without intercept in R:

> k <- 40
> x <- 1:k
> sigma <- 30
> y1 <- rnorm(k, 1*x, sigma)
> y2 <- rnorm(k, 2*x, sigma)
> y3 <- rnorm(k, 3*x, sigma)
> x <- rep(x, 3)
> y <- c(y1,y2,y3)
> group <- gl(3,k)
> summary(lm(y~0+x+group:x))

Call:
lm(formula = y ~ 0 + x + group:x)

Residuals:
    Min      1Q  Median      3Q     Max 
-71.110 -20.389  -2.496  15.913  76.937 

Coefficients: (1 not defined because of singularities)
         Estimate Std. Error t value Pr(>|t|)    
x          3.1933     0.1989  16.054  < 2e-16 ***
x:group1  -2.0444     0.2813  -7.268 4.44e-11 ***
x:group2  -1.2994     0.2813  -4.619 9.97e-06 ***
x:group3       NA         NA      NA       NA    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 

Residual standard error: 29.6 on 117 degrees of freedom
Multiple R-squared: 0.7654,     Adjusted R-squared: 0.7594 
F-statistic: 127.3 on 3 and 117 DF,  p-value: < 2.2e-16 

The parameter "x:group3" is not estimated; of course, this parameter is actually the same as "x". The estimates seem to be correct, but are the p-values correct? And is there another way to fit this model without running into this problem?

kjetil b halvorsen
Stéphane Laurent
  • When you have a factor variable, omitting the intercept does not really change the model, it only changes the parametrization. Also see https://stats.stackexchange.com/questions/7948/when-is-it-ok-to-remove-the-intercept-in-a-linear-regression-model – kjetil b halvorsen Nov 26 '18 at 09:45

1 Answer


It is almost never a good idea to suppress the intercept. It can be OK under some circumstances, and in this case the model does accurately reflect the data-generating process because of the way it was set up, but in general I don't see that much is gained. If you had included intercepts, their sampling distributions would have been centered on 0 and the slope estimates would still have been unbiased. On the other hand, if the data-generating process had included intercepts that were anything other than exactly 0, suppressing them would have biased the slope estimates. Moreover, including them would only cost you 3 extra degrees of freedom, which you certainly have to spare. (Sorry about the rant.)
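To make this concrete, here is a sketch (reusing `x`, `y`, and `group` from the question's simulation) of the fit with the intercepts included:

```r
# Sketch: refit the simulated data with intercepts included,
# reusing x, y, and group from the question's code.
fit_int <- lm(y ~ x * group)  # separate intercept and slope per group
summary(fit_int)
# Since the data were generated with zero intercepts, the estimated
# intercepts should be near 0 and the slopes near 1, 2, and 3.
```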

As for your specific questions, there may be an alternative approach with R that returns the information organized differently, but I don't know what it would be, and it wouldn't make any substantive difference. The p-values are fine.
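For what it's worth, one reparametrization that gives the same fit without the aliased row, sketched here with the question's `x`, `y`, and `group`, is to estimate one slope per group directly:

```r
# Sketch: reparametrize so that each group's slope is estimated
# directly, which avoids the aliased x:group3 coefficient entirely.
fit_slopes <- lm(y ~ 0 + x:group)
summary(fit_slopes)
# The coefficients x:group1, x:group2, x:group3 are now the three
# group slopes themselves; the fitted values are identical to those
# of y ~ 0 + x + group:x, only the parametrization differs.
```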

One last note: because you included the interactions, this is technically not an ANCOVA. ANCOVA is a special case of multiple regression, with both categorical and continuous covariates, that assumes parallel slopes.
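For contrast, a sketch of the parallel-slopes ANCOVA (same `x`, `y`, and `group` as above), together with an F-test of whether separate slopes are needed:

```r
# Sketch: the classic ANCOVA constrains the slopes to be parallel,
# so the group effect shifts the intercept only.
fit_ancova <- lm(y ~ x + group)   # parallel slopes
fit_full   <- lm(y ~ x * group)   # separate slopes per group
anova(fit_ancova, fit_full)       # tests the parallel-slopes assumption
```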

gung - Reinstate Monica
  • Thanks for your advice. But my real dataset is precisely a situation where it is sensible to drop the intercept. I am interested in the ratio y/x, and I want to assess whether the group has an effect on this ratio. I have two alternative ideas: 1) ANOVA with the response y/x; 2) ANOVA on the residuals of the regression without intercept. Do you have an opinion about these ideas? – Stéphane Laurent May 04 '12 at 15:18
  • That sounds like a separate question to me! – onestop May 04 '12 at 15:54
  • The results that R is returning may be funny looking, but they're not wrong, and you know how to interpret them--you aren't making an (interpretive) error, either. Moreover, the p-values are correct, so I don't see a big issue here. In your specific situation, it may be fine to suppress the intercept, I was just stating that for the record. Regarding using a response variable that's a ratio, my one recommendation would be to use the difference of two logs (ie, ln(y)-ln(x)). Otherwise, I think you'll be OK. – gung - Reinstate Monica May 04 '12 at 18:40
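A minimal sketch of that log-ratio ANOVA, assuming all y values are strictly positive (which the rnorm simulation in the question does not guarantee):

```r
# Sketch: ANOVA on the log ratio log(y) - log(x) = log(y/x),
# assuming y > 0 throughout. I() protects the arithmetic inside
# the formula.
fit_ratio <- lm(I(log(y) - log(x)) ~ group)
anova(fit_ratio)   # F-test for a group effect on the log ratio
```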