I am creating a mixed model in R using the lme4() package. I fit my full model using the lmer() function, with a single random effect.
lme0 = lmer(L50 ~ (1|SITE2) + compost + ZONE + POSITION +
PRED_BIOMASS + CS_DENSITY + RICHNESS + CHLA_RASTER + SST_RASTER, data = gbr)
I am confused on how to proceed with model selection. A stepwise approach, a la dredge(), suggests the following as the 'best' model, one with almost all predictors retained:
head(dredge(lme0))
Fixed term is "(Intercept)"
Global model call: lmer(formula = L50 ~ (1 | SITE2) + compost + ZONE + POSITION +
PRED_BIOMASS + CS_DENSITY + RICHNESS + CHLA_RASTER + SST_RASTER,
data = gbr)
---
Model selection table
(Int) CHL_RAS POS RIC SST_RAS ZON df logLik AICc delta weight
202 -147.40 21.98 + 13.65 + 8 -160.115 340.0 0.00 0.559
But when I apply the lmerTest:step() function, which uses an F test as opposed to an AICc value, the 'best' fit model has only one fixed and one random parameter. I know that the step() function uses maximum liklehihod to fit the random effects, and doesn't seem to compare models on AIC.
s <- lmerTest::step(lme0)
s
Random effects:
Chi.sq Chi.DF elim.num p.value
SITE2 12.96 1 kept 3e-04
Fixed effects:
Sum Sq Mean Sq NumDF DenDF F.value elim.num Pr(>F)
PRED_BIOMASS 0.0090 0.0090 1 20.59 0.0002 1 0.9898
compost 1.3192 1.3192 1 26.23 0.0259 2 0.8735
CS_DENSITY 7.9498 7.9498 1 35.78 0.1634 3 0.6885
SST_RASTER 27.2554 27.2554 1 25.66 0.5660 4 0.4587
ZONE 89.9116 44.9558 2 19.06 0.9115 5 0.4187
RICHNESS 62.6404 62.6404 1 37.94 1.2645 6 0.2679
CHLA_RASTER 153.7626 153.7626 1 42.82 3.0935 7 0.0858
POSITION 3518.9881 3518.9881 1 22.90 63.8783 kept <1e-07
Least squares means:
POSITION Estimate Standard Error DF t-value Lower CI Upper CI p-value
POSITION Mid 1.0 236.56 5.35 23.5 44.24 226 248 <2e-16 ***
POSITION Outer 2.0 187.55 3.00 21.0 62.43 181 194 <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Differences of LSMEANS:
Estimate Standard Error DF t-value Lower CI Upper CI p-value
POSITION Mid - Outer 49.0 6.13 22.9 7.99 36.3 61.7 <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Final model:
lme4::lmer(formula = L50 ~ (1 | SITE2) + POSITION, data = gbr,
contrasts = list(POSITION = "contr.SAS"))
Please help me understand what the fundamental difference is between the dredge() and step() functions. Why would these two approaches result in two very different 'best' models?
I'll add that the AICc for step()'s best model is higher than the one proposed by dredge():
lme1 = lmer(L50 ~ (1|SITE2) + POSITION, data = gbr)
AICc(lme1)
[1] 359.5486
Let me know if you need more information, Thanks!