This may be quite a basic question. I was fitting a simple linear model and dropping non-significant terms one at a time until I reached a minimal model. Once I had it, I tested the significance of each remaining explanatory variable in the same way, by dropping it and comparing the reduced model with the minimal one. What I am finding is that one of the variables in my minimal model appears non-significant when I test it by dropping it, even though it looks highly significant in the ANOVA table, and I do not quite understand what is going on. If someone could give me a hint, that would be great (code is below).
So my minimal model was:
m1<-lm(log10(para.ml) ~ treat + prop.r + log(od))
anova(m1)
Analysis of Variance Table
Response: log10(para.ml)
          Df Sum Sq Mean Sq F value    Pr(>F)
treat      3 4.2925 1.43083  30.113 9.181e-09 ***
prop.r     1 1.5419 1.54190  32.451 4.723e-06 ***
log(od)    1 0.5698 0.56981  11.992  0.001796 **
Residuals 27 1.2829 0.04751
Here treat is a factor with 4 levels, and prop.r and log(od) are both continuous variables. As you can see, all effects look significant in this table, and if I drop prop.r or log(od), model m1 is still preferred. However, the same does not happen if I drop treat:
m2<-update(m1,~.-treat)
anova(m2,m1)
Analysis of Variance Table
Model 1: log10(para.ml) ~ prop.r + log(od)
Model 2: log10(para.ml) ~ treat + prop.r + log(od)
  Res.Df    RSS Df Sum of Sq      F  Pr(>F)
1     30 1.6202
2     27 1.2829  3   0.33728 2.3662 0.09308 .
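As a sanity check on what this comparison computes, here is a small sketch that rebuilds the F statistic by hand from the rounded RSS values printed above. The variable names are mine, and because the inputs are rounded the result will only agree with the table to about three decimal places.

```r
# Rounded values copied from the anova(m2, m1) output above
rss_reduced <- 1.6202   # m2: prop.r + log(od)
rss_full    <- 1.2829   # m1: treat + prop.r + log(od)
df_diff     <- 3        # treat has 4 levels, so it uses 3 df
df_resid    <- 27       # residual df of the full model m1

# F = (extra SS explained by treat, per df) / (residual mean square of m1)
f_stat <- ((rss_reduced - rss_full) / df_diff) / (rss_full / df_resid)
p_val  <- pf(f_stat, df_diff, df_resid, lower.tail = FALSE)

print(c(F = f_stat, p = p_val))
```

So this test only credits treat with the variation it explains once prop.r and log(od) are already in the model.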
If I then get the ANOVA table for the reduced model m2, I obtain this:
anova(m2)
Analysis of Variance Table
Response: log10(para.ml)
          Df Sum Sq Mean Sq F value    Pr(>F)
prop.r     1 4.8607  4.8607  90.003 1.537e-10 ***
log(od)    1 1.2062  1.2062  22.335 5.045e-05 ***
Residuals 30 1.6202  0.0540
Comparing anova(m2) with anova(m1), it looks like most of the variation that was attributed to treat in m1 is attributed to prop.r or log(od) once treat is removed. Is it simply that, when treat is in the model, it absorbs a lot of variation that the other variables could explain equally well?
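In case it helps anyone reproduce the pattern, here is a minimal sketch with simulated stand-in data (not my real data; the data frame d and all the numbers in it are made up, with prop.r deliberately built to depend on treat). It puts the sequential table from anova() next to the drop-one tests from drop1(), which I assume is the comparison that matters here.

```r
# Simulated stand-in data (hypothetical; not the real experiment)
set.seed(1)
n <- 32
treat <- factor(rep(c("a", "b", "c", "d"), each = n / 4))
# prop.r is constructed to depend on treat, so the two predictors overlap
prop.r <- as.numeric(treat) / 5 + rnorm(n, sd = 0.05)
od <- runif(n, 0.2, 1.5)
para.ml <- 10^(2 + 3 * prop.r + 0.5 * log(od) + rnorm(n, sd = 0.2))
d <- data.frame(para.ml, treat, prop.r, od)

m1 <- lm(log10(para.ml) ~ treat + prop.r + log(od), data = d)
m2 <- update(m1, . ~ . - treat)

seq_tab  <- anova(m1)              # sequential: each term adjusted only for those above it
drop_tab <- drop1(m1, test = "F")  # marginal: each term adjusted for all the others
cmp_tab  <- anova(m2, m1)          # same test as the treat row of drop_tab

print(seq_tab)
print(drop_tab)
print(cmp_tab)
```

In anova(m1) the treat row is tested before prop.r and log(od) enter the model, whereas in drop1() and anova(m2, m1) it is tested after them, so the two can disagree when the predictors are correlated.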
Any help appreciated!