
I've been working on some multilevel models using the lmer function in R and have been experimenting with different ways of testing the significance of the fixed effects in my model. I have found that the summary() function and the anova() function from lmerTest yield different results. My understanding is that the anova() function tests whether any of my groups differs from the intercept, whereas the summary() function displays the significance of each individual group's deviation from the intercept. However, the anova() function does not return a significant Origin:Fert interaction, whereas the summary() function reports that OriginCO:FertUnfertilized is significant.

What gives? Am I missing something here?

> mod_rs_Origin_lmer_nelder=lmer(rs_feedback ~ 
+                                    Date_of_Emergence  + Origin*Fert +  (1 | Soil_ID), data=data,
+                                  control = lmerControl(optimizer ="Nelder_Mead"))
> anova(mod_rs_Origin_lmer_nelder, type=2)
Analysis of Variance Table of type II  with  Satterthwaite 
approximation for degrees of freedom
                  Sum Sq Mean Sq NumDF DenDF F.value   Pr(>F)   
Date_of_Emergence 1.3155 1.31552     1   148  4.6081 0.033450 * 
Origin            2.6584 0.66461     4   148  2.3281 0.058853 . 
Fert              2.9384 2.93838     1   148 10.2928 0.001637 **
Origin:Fert       2.1927 0.54817     4   148  1.9202 0.110035   
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
> summary(mod_rs_Origin_lmer_nelder)
Linear mixed model fit by REML t-tests use Satterthwaite approximations to degrees of  freedom
[lmerMod]
Formula: rs_feedback ~ Date_of_Emergence + Origin * Fert + (1 | Soil_ID)
   Data: data
Control: lmerControl(optimizer = "Nelder_Mead")

REML criterion at convergence: 272

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-1.6043 -0.6106 -0.2517  0.4541  4.7311 

Random effects:
 Groups   Name        Variance Std.Dev.
 Soil_ID  (Intercept) 0.0000   0.0000  
 Residual             0.2855   0.5343  
Number of obs: 159, groups:  Soil_ID, 4

Fixed effects:
                            Estimate Std. Error         df t value Pr(>|t|)   
(Intercept)                 0.225550   0.134766 148.000000   1.674  0.09631 . 
Date_of_Emergence          -0.007822   0.003644 148.000000  -2.147  0.03345 * 
OriginCO                   -0.114934   0.180923 148.000000  -0.635  0.52624   
OriginF                    -0.197089   0.190659 148.000000  -1.034  0.30295   
OriginQM                   -0.027523   0.187279 148.000000  -0.147  0.88336   
OriginQR                   -0.030363   0.178115 148.000000  -0.170  0.86487   
FertUnfertilized            0.524999   0.186802 148.000000   2.810  0.00562 **
OriginCO:FertUnfertilized  -0.577240   0.261952 148.000000  -2.204  0.02910 * 
OriginF:FertUnfertilized    0.043589   0.281231 148.000000   0.155  0.87704   
OriginQM:FertUnfertilized  -0.421518   0.270105 148.000000  -1.561  0.12076   
OriginQR:FertUnfertilized  -0.248637   0.258104 148.000000  -0.963  0.33696   
Jake
  • This is mostly about software. – Michael R. Chernick Apr 03 '17 at 03:38
  • I believe this is on-topic here @MichaelChernick. This Q is entirely statistical and not about software at all. – amoeba Apr 03 '17 at 07:28
  • If that were the case, @amoeba, why is there mostly code in the question? It looks to me like the OP is asking about the lmer package in R. – Michael R. Chernick Apr 03 '17 at 11:55
  • Your understanding of the difference is not correct, I fear: in one case you are testing one coefficient at a time, and in the other you are testing all of them simultaneously. – mdewey Apr 03 '17 at 12:38
  • @mdewey Yes, that is my understanding of the difference between the two functions. Wouldn't it then follow that if I find at least one significant difference using the summary function, I should find a significant effect in my anova table? Is that an incorrect assumption? – Jake Apr 04 '17 at 04:14
  • @Jake Not necessarily. Since you are doing 4 tests at once, there is an inflated risk of a Type I error. – Mark White Apr 21 '17 at 20:06
  • @MarkWhite So, when I do an anova, is it doing some sort of correction for false positives across the multiple tests being done? – Jake Apr 23 '17 at 19:37
  • @Jake In short, yes. I'm not sure of the technical details of how the omnibus test does this, but the issue has been addressed here before: https://stats.stackexchange.com/questions/59910/relation-between-omnibus-test-and-multiple-comparison – Mark White Apr 23 '17 at 20:24

1 Answer


It looks like there are 2 levels of Fert and 5 levels of Origin, correct?

I believe the anova() output shows the omnibus test, while the summary() function shows regression coefficients that represent specific contrasts, which are defined by the reference group (i.e., whatever level is first).

The Origin:Fert row shows the omnibus significance of the interaction as a whole. OriginCO:FertUnfertilized, by contrast, tests whether the difference between the reference Origin level and the CO level depends on whether the Fert level is Fertilized or Unfertilized.
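To see that these are different hypotheses, the omnibus row can be (approximately) reproduced by testing all four interaction coefficients jointly. A sketch using the car package (my assumption; car is not mentioned in the thread, and the model object name is taken from the question):

```r
## Joint Wald test of all four Origin:Fert coefficients at once --
## this is the hypothesis the anova() row addresses, whereas each
## summary() row tests a single contrast in isolation
library(car)  # assumed available; provides linearHypothesis()
linearHypothesis(mod_rs_Origin_lmer_nelder,
                 c("OriginCO:FertUnfertilized = 0",
                   "OriginF:FertUnfertilized = 0",
                   "OriginQM:FertUnfertilized = 0",
                   "OriginQR:FertUnfertilized = 0"))
```

For merMod objects this gives a Wald chi-square test, so the p-value will not match the Satterthwaite F-test exactly, but it addresses the same joint null.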

The authors of the lme4 package deliberately did not include dfs or p-values in their output, because estimating these dfs is tricky for multilevel models. Instead, they suggested comparing nested models. If you are looking for the omnibus effect of the interaction, for example, I would compare a model with the interaction to a model without it:

# Additive model: no Origin:Fert interaction
mod0 <- lmer(rs_feedback ~ Date_of_Emergence + Origin + Fert + (1 | Soil_ID),
             data = data, control = lmerControl(optimizer = "Nelder_Mead"))

# Full model including the interaction
mod1 <- lmer(rs_feedback ~ Date_of_Emergence + Origin * Fert + (1 | Soil_ID),
             data = data, control = lmerControl(optimizer = "Nelder_Mead"))

# Likelihood-ratio test of the interaction; because the models differ
# in their fixed effects, anova() refits them with ML (its default)
# rather than comparing the REML fits
anova(mod1, mod0)
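If you prefer an omnibus F-test over a likelihood-ratio test, lmerTest's anova() method also lets you switch the degrees-of-freedom approximation (a sketch; the Kenward-Roger option requires the pbkrtest package to be installed):

```r
## Same type II omnibus table as in the question, but with
## Kenward-Roger degrees of freedom instead of Satterthwaite
anova(mod_rs_Origin_lmer_nelder, type = 2, ddf = "Kenward-Roger")
```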
Mark White