In an ANOVA framework, is it possible to test the hypothesis that $\beta_0 = 0$ given a "full" model of $\hat y_i = \beta_0 + \beta_1 x_i$? Or does the "full" model not contain the "small" model $\hat y_i = \beta_1 x_i$ (and are there bias-related issues)?

Glen_b
rrrrr
  • It's not quite clear what you are asking about when you say "are there bias related issues" ... any time you have a smaller model being compared to a larger one, if the larger model is the correct one, the estimates in the smaller model are biased. If that's not what you're asking about you should clarify. – Glen_b Oct 24 '16 at 05:19
  • Thanks -- I meant bias in the sense of errors not summing to zero, and I guess my question was more along the lines of: is the space spanned by the columns of the "small" model contained in the space spanned by the columns of the "full" model? Somehow I want to say "no" because the errors in the "small" model do not sum to zero, right? – rrrrr Oct 24 '16 at 14:35
  • You mean the residuals? Sure, rather than their mean being zero you instead have a weighted mean of them being zero. But in relation to estimation, bias is about $E(T-\theta)$ for some estimator $T$ of some unknown quantity $\theta$. I'm still not entirely sure I see the issue. You still have $E(\hat{\beta}_1)=\beta_1$ for example. – Glen_b Oct 24 '16 at 21:11
  • Sorry, I was referring to the sum of the residuals being nonzero, which you pointed out. Ok, so just to confirm, is it correct to think about $\operatorname{span}(\text{small model}) \subset \operatorname{span}(\text{full model})$ in the sense that the small model is a subspace of the columns of the larger model's design matrix? – rrrrr Oct 25 '16 at 14:56
  • If you're asking "what makes models nested" you could try some of the questions on site -- e.g. maybe this one could be of some help: http://stats.stackexchange.com/questions/4717/what-is-the-difference-between-a-nested-and-a-non-nested-model – Glen_b Oct 26 '16 at 06:15

1 Answer

Yes, you can test whether the intercept is 0 via ANOVA, or indeed by looking at the regression coefficient's t-value.
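
On the nesting question raised in the comments, a sketch in design-matrix notation may help (the symbols $X_{\text{full}}$, $X_{\text{small}}$ and $\mathrm{RSS}$ are notation introduced here, not taken from the question). The small model's single column is also a column of the full model, so its column space is contained in the full model's, and setting $\beta_0 = 0$ in the full model gives the small one:

$$X_{\text{full}} = \begin{pmatrix} 1 & x_1 \\ \vdots & \vdots \\ 1 & x_n \end{pmatrix}, \qquad X_{\text{small}} = \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix}, \qquad \operatorname{span}(X_{\text{small}}) \subset \operatorname{span}(X_{\text{full}})$$

Assuming the usual independent, normally distributed errors with constant variance, the comparison of the two fits then uses

$$F = \frac{(\mathrm{RSS}_{\text{small}} - \mathrm{RSS}_{\text{full}})/1}{\mathrm{RSS}_{\text{full}}/(n-2)} \sim F_{1,\,n-2} \quad \text{under } H_0\colon \beta_0 = 0.$$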

  1. An example (in R) using ANOVA:

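    # Full model: intercept plus slope
    full <- lm(dist ~ speed, data = cars)
    # Reduced model: slope only (the `0 +` drops the intercept)
    noint <- lm(dist ~ 0 + speed, data = cars)
    # F-test comparing the two nested fits
    anova(noint, full)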
    Analysis of Variance Table
    
    Model 1: dist ~ 0 + speed
    Model 2: dist ~ speed
      Res.Df   RSS Df Sum of Sq      F  Pr(>F)  
    1     49 12954                              
    2     48 11354  1    1600.3 6.7655 0.01232   <-----
    

The p-value for the intercept term is 0.0123.

  2. Using the t-ratio for the intercept term:

    summary(lm(dist ~ speed, data = cars))
    
    Coefficients:
                Estimate Std. Error t value Pr(>|t|)    
    (Intercept) -17.5791     6.7584  -2.601   0.0123   <---------
    speed         3.9324     0.4155   9.464 1.49e-12 
    --
    Residual standard error: 15.38 on 48 degrees of freedom
    Multiple R-squared:  0.6511,    Adjusted R-squared:  0.6438 
    F-statistic: 89.57 on 1 and 48 DF,  p-value: 1.49e-12
    

Again, the p-value for the intercept term is 0.0123.
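
As a check on why the two p-values agree, the numbers printed in the two outputs above can be plugged in by hand. This is only a worked sketch using the rounded values as displayed, so the last digit may differ slightly from what R computes internally:

$$F = \frac{1600.3/1}{11354/48} \approx \frac{1600.3}{236.54} \approx 6.765, \qquad t^2 = (-2.601)^2 \approx 6.765$$

The denominator $11354/48 \approx 236.54$ is the squared residual standard error ($15.38^2$) of the full model. When a single coefficient is tested, the F statistic on $(1, 48)$ degrees of freedom is the square of the t statistic on 48 degrees of freedom, so the two approaches give the same p-value.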

Glen_b