I'm trying to understand why anova(fit1, fit3, fit5) and anova(fit1, fit3) give different results for the same comparison, while anova(fit1, fit3, fit5) and anova(fit3, fit5) give the same result.
Here's the code:
> data(swiss)
> fit1 <- lm(Fertility ~ Agriculture, data=swiss)
> fit3 <- lm(Fertility ~ Agriculture + Examination + Education, data=swiss)
> fit5 <- lm(Fertility ~ Agriculture + Examination + Education + Catholic + Infant.Mortality, data=swiss)
> class(fit1)
[1] "lm"
> class(fit3)
[1] "lm"
> class(fit5)
[1] "lm"
> anova(fit1, fit3, fit5)
Analysis of Variance Table
Model 1: Fertility ~ Agriculture
Model 2: Fertility ~ Agriculture + Examination + Education
Model 3: Fertility ~ Agriculture + Examination + Education + Catholic +
    Infant.Mortality
  Res.Df    RSS Df Sum of Sq      F    Pr(>F)
1     45 6283.1
2     43 3180.9  2    3102.2 30.211 8.638e-09 ***
3     41 2105.0  2    1075.9 10.477 0.0002111 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
> anova(fit1, fit3)
Analysis of Variance Table
Model 1: Fertility ~ Agriculture
Model 2: Fertility ~ Agriculture + Examination + Education
  Res.Df    RSS Df Sum of Sq      F    Pr(>F)
1     45 6283.1
2     43 3180.9  2    3102.2 20.968 4.407e-07 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
> anova(fit3, fit5)
Analysis of Variance Table
Model 1: Fertility ~ Agriculture + Examination + Education
Model 2: Fertility ~ Agriculture + Examination + Education + Catholic +
    Infant.Mortality
  Res.Df    RSS Df Sum of Sq      F    Pr(>F)
1     43 3180.9
2     41 2105.0  2    1075.9 10.477 0.0002111 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
This is how I read the output:
anova(fit1, fit3, fit5) shows a p-value of 8.638e-09 for the comparison of fit1 and fit3, but anova(fit1, fit3) shows 4.407e-07 for the same comparison. The F statistic differs too (30.211 vs. 20.968), even though Df and Sum of Sq are identical in both tables. This doesn't make sense to me.
On the other hand, anova(fit1, fit3, fit5) and anova(fit3, fit5) show the same p-value, 0.0002111, for the comparison of fit3 and fit5.
I'm probably missing something. What is it?
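As a hand check, here is a minimal sketch that recomputes the two F statistics for the fit1 vs. fit3 step from the printed tables. It assumes F is (Sum of Sq / Df) divided by a residual mean square (RSS / Res.Df), and that the only difference between the two calls is which model supplies that denominator: fit3 in the two-model call, fit5 (the largest model) in the three-model call. The variable names are my own, and the inputs are the rounded values printed above, so the last digit may differ slightly from R's output.

```r
# Values copied from the anova tables above (rounded as printed).
ss_step     <- 3102.2   # Sum of Sq for the step fit1 -> fit3
df_step     <- 2        # Df for that step
rss_fit3    <- 3180.9   # residual sum of squares of fit3
res_df_fit3 <- 43       # residual degrees of freedom of fit3
rss_fit5    <- 2105.0   # residual sum of squares of fit5
res_df_fit5 <- 41       # residual degrees of freedom of fit5

# Mean square for the step; the same in both tables.
ms_step <- ss_step / df_step

# Assumption: anova(fit1, fit3) divides by the residual mean square of fit3.
f_two_models <- ms_step / (rss_fit3 / res_df_fit3)

# Assumption: anova(fit1, fit3, fit5) divides by the residual mean square
# of fit5, the largest model in the call.
f_three_models <- ms_step / (rss_fit5 / res_df_fit5)

# Upper-tail p-values from the F distribution with the matching
# denominator degrees of freedom.
p_two_models   <- pf(f_two_models,   df_step, res_df_fit3, lower.tail = FALSE)
p_three_models <- pf(f_three_models, df_step, res_df_fit5, lower.tail = FALSE)

print(round(c(two = f_two_models, three = f_three_models), 3))
print(signif(c(two = p_two_models, three = p_three_models), 4))
```

The two F values come out at about 20.97 and 30.21, matching the tables. That suggests the difference comes from the denominator rather than from the comparison itself, which would also explain why the fit3 vs. fit5 row is identical in both calls: fit5 is the largest model in each. The help page for anova.lm describes the F statistic as comparing each row's mean square to the residual sum of squares of the largest model considered.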