How to check overdispersion of binomial GLMMs, lme4 package

Question

I have the following model:

fit1 <- glmer(Res~FA+FB+FC+(1|fsite), family=binomial(), data=DATA)

The result of summary() is:

summary(fit1)
Generalized linear mixed model fit by maximum likelihood 
 (Laplace Approximation) ['glmerMod']
 Family: binomial  ( logit )
Formula: Res ~ FA + FB + FC + (1 | fsite)
   Data: DATA

     AIC      BIC   logLik deviance df.resid 
   202.3    229.9    -92.1    184.3      150 

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-2.1768 -0.6167 -0.4967  0.6815  2.0132 

Random effects:
 Groups Name        Variance Std.Dev.
 fsite  (Intercept) 0        0       
Number of obs: 159, groups:  fsite, 28

Fixed effects:
            Estimate Std. Error z value Pr(>|z|)    
(Intercept)  1.55573    0.55830   2.787 0.005327 ** 
FA2         -0.11914    0.37344  -0.319 0.749692    
FB2         -1.38652    0.39967  -3.469 0.000522 ***
FC2         -0.14976    0.61984  -0.242 0.809076    
FC3         -0.06794    0.63171  -0.108 0.914350    
FC4         -1.20114    0.61670  -1.948 0.051452 .  
FC5         -1.44951    0.62817  -2.308 0.021025 *  
FC6         -1.13590    0.65427  -1.736 0.082538 .  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
        (Intr)    FA2    FB2    FC2    FC3    FC4    FC5 
FA2     -0.466                                          
FB2     -0.456  0.169                                   
FC2     -0.572 -0.021  0.017                            
FC3     -0.596  0.050  0.036  0.506                     
FC4     -0.582 -0.005  0.020  0.519  0.509              
FC5     -0.558  0.019 -0.038  0.508  0.500  0.511       
FC6     -0.391 -0.101 -0.288  0.485  0.467  0.486  0.492

Why are the variance and Std.Dev. of the random effect zero?
How do I check for overdispersion in this model?
What should I do if there is overdispersion?

What kind of numbers are there in `Res`? Is it a vector of `1`s & `0`s, or is each entry a count of 'successes' out of multiple trials? — gung - Reinstate Monica, Aug 19 '16 at 14:53
@RobertLong, note that the `[r]` tag provides syntax highlighting. That's 1 of the reasons I added it. I'm not sure if `[mixed-model]` adds much, given `[lme4]` & `[glmer]`. Maybe I'll try `[glmm]` for both `[mixed-model]` & `[glmer]`, & put `[r]` back for the syntax highlighting. — gung - Reinstate Monica, Aug 19 '16 at 17:03
@gung ahh - OK I didn't know that, sorry. Personally I'd like to see a search of the [mixed-model] tag find a question like this so maybe one of [glmm] and [lme4] is redundant ? Hmmm — Robert Long, Aug 19 '16 at 17:43
@RobertLong, I think Ben might prefer the [lme4] tag. I just switched [mm] for [glmm]. Hopefully, that will do it. — gung - Reinstate Monica, Aug 19 '16 at 20:55

score 5 · Accepted Answer · edited Apr 24 '18 at 16:23

Why are the variance and Std.Dev of the random effects zero?

Because the variance among sites in your data is less than would be expected from binomial sampling alone; a variance can't be negative, so it is estimated as zero (a singular fit). The GLMM FAQ discusses this.
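The truncation at zero can be illustrated with a rough moment-based analogy in base R. This is only a sketch: it is not what glmer computes (glmer maximizes a likelihood on the logit scale), and the group size `n_per` and probability `p_true` are hypothetical values chosen for illustration.

```r
# Moment-based analogy only; glmer itself fits by maximum likelihood.
set.seed(1)
n_sites <- 28    # same number of groups as in the question
n_per   <- 6     # hypothetical balanced group size
p_true  <- 0.5   # hypothetical common success probability (no site effect)

# Binary responses with NO true among-site variation
y    <- matrix(rbinom(n_sites * n_per, size = 1, prob = p_true),
               nrow = n_sites)
phat <- rowMeans(y)   # observed proportion of ones per site
pbar <- mean(phat)    # overall proportion of ones

var_observed <- var(phat)                   # among-site variance of proportions
var_binomial <- pbar * (1 - pbar) / n_per   # expected from binomial sampling alone

raw_excess   <- var_observed - var_binomial # can be negative by chance
site_var_est <- max(0, raw_excess)          # a variance estimate is bounded at 0

c(observed = var_observed, binomial = var_binomial,
  raw_excess = raw_excess, estimate = site_var_est)
```

When there is little or no real site-to-site variation, the raw excess often comes out negative by chance, and the bounded estimate is then exactly zero, which is the situation shown in the summary output.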

How do I check for overdispersion in this model?

This looks like a binary (not just binomial) regression, i.e. your responses are 0/1. (If you had "m out of N" responses with N>1, you would need either a two-column response of (successes, failures) or a proportion response with N supplied through the weights argument.) For binary responses, overdispersion is not identifiable (e.g., see here or here), so tl;dr: you don't need to worry about it. If you did have N>1, the GLMM FAQ linked above gives some guidance: the overdisp_fun defined there can be used, and depending on your philosophy of model-building you can use a hypothesis test (e.g. $p<0.05$?) or a rule of thumb (e.g. overdispersion factor > 1.1) to decide whether you should worry about it.
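For the N>1 case, a Pearson-based check along the lines of the FAQ's overdisp_fun might look like the sketch below. The function body is written from memory of the FAQ, so treat it as an approximation of that code. To stay self-contained it is demonstrated on a plain `glm` fit to simulated data; all data-generating values (`n_groups`, `N`, the extra noise on the logit scale) are hypothetical.

```r
# Pearson chi-square overdispersion check (sketch modeled on the GLMM FAQ)
overdisp_fun <- function(model) {
  rdf <- df.residual(model)                    # residual degrees of freedom
  rp  <- residuals(model, type = "pearson")    # Pearson residuals
  pearson_chisq <- sum(rp^2)
  ratio <- pearson_chisq / rdf                 # overdispersion factor
  pval  <- pchisq(pearson_chisq, df = rdf, lower.tail = FALSE)
  c(chisq = pearson_chisq, ratio = ratio, rdf = rdf, p = pval)
}

# Hypothetical "m out of N" data with extra-binomial variation
set.seed(42)
n_groups <- 100
N        <- 20
x        <- rnorm(n_groups)
p        <- plogis(0.5 * x + rnorm(n_groups, sd = 1))  # unexplained noise
successes <- rbinom(n_groups, size = N, prob = p)
failures  <- N - successes

# Two-column (successes, failures) response, as described above
fit <- glm(cbind(successes, failures) ~ x, family = binomial())
res <- overdisp_fun(fit)
res
```

For a glmer fit with an N>1 response, the same function would be called as `overdisp_fun(fit1)`, assuming lme4 is installed. The simulated data here are built to be overdispersed, so the ratio should come out well above 1.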

What should I do if there is overdispersion?

See previous answer (i.e., don't worry about it if you have Bernoulli responses; otherwise, see the FAQ or elsewhere for strategies).
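One way to see why a Bernoulli response carries no information about overdispersion is the intercept-only case, where the Pearson chi-square always equals the number of observations, whatever the data look like. The sketch below uses simulated 0/1 vectors with hypothetical probabilities; with covariates in the model the identity is no longer exact, so this is an illustration rather than a proof.

```r
# Pearson chi-square for an intercept-only logistic regression on 0/1 data
pearson_binary <- function(y) {
  fit <- glm(y ~ 1, family = binomial())
  sum(residuals(fit, type = "pearson")^2)
}

set.seed(7)
n  <- 159                                # same sample size as in the question
y1 <- rbinom(n, size = 1, prob = 0.2)    # hypothetical: mostly zeros
y2 <- rbinom(n, size = 1, prob = 0.6)    # hypothetical: mostly ones

chisq1 <- pearson_binary(y1)
chisq2 <- pearson_binary(y2)
c(chisq1 = chisq1, chisq2 = chisq2, n = n)
```

Both statistics match n up to numerical tolerance: with fitted probability equal to the sample mean, the squared Pearson residuals sum to n0 + n1 = n, so the statistic cannot signal extra variation.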

What do I do if the data are unbalanced? (asked in comments)

Unless you have data that are severely unbalanced, or unless the data are structured such that you have complete separation (all zeros or all ones for some combinations of predictor variables), GLMMs will handle unbalanced data fine; there's no need for manual adjustment.
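A quick way to look for complete separation is to cross-tabulate the response against each categorical predictor (or combination of predictors) and flag rows containing an empty cell. The sketch below uses a small made-up data frame; the column names `FB` and `Res` echo the question, but the values are hypothetical.

```r
# Hypothetical data: level "c" of the predictor has only ones
dat <- data.frame(
  FB  = factor(c("a", "a", "a", "b", "b", "b", "c", "c", "c")),
  Res = c(0, 1, 0, 1, 0, 1, 1, 1, 1)
)

# Overall balance of zeros and ones
prop.table(table(dat$Res))

# Counts of 0/1 within each predictor level
tab <- table(dat$FB, dat$Res)
tab

# Levels where one outcome never occurs (all zeros or all ones)
separated <- rownames(tab)[apply(tab == 0, 1, any)]
separated
```

Here only level "c" is flagged. An overall imbalance such as 60% vs 40% is not a problem in itself; empty cells like this one are what cause trouble for the fit.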

Many thanks for the answer. I am a little confused. In this example, the response variable is 0/1 data, so overdispersion doesn't apply here. However, how do I deal with data that have unequal numbers of zeros and ones? How can I judge whether they are too unequal? If 80% vs 20% counts as unequal, how about 60% vs 40%? For proportion data, can I use ‘overdisp_fun’ to test for overdispersion? Can I say that there is overdispersion when the ‘p’ value is smaller than 0.05? — cww, Aug 20 '16 at 15:53
