1

I'm a newbie, so apologies in advance if this Q is missing any useful detail. I'm trying to test the effect of condition upon the number of times certain behaviors are produced by a group of individuals. Each individual is tested multiple times in different sessions. My data looks like this:

Subject Session Condition Count
1   1   A   0
1   2   A   0
1   3   A   0
1   4   A   0
2   1   A   0
2   2   A   0
2   3   A   0
2   4   A   0
3   1   A   0
3   2   A   0
3   3   A   0
3   4   A   0
4   1   A   0
4   2   A   0
4   3   A   0
4   4   A   0
5   1   A   0
5   2   A   0
5   3   A   0
5   4   A   0
6   1   A   0
6   2   A   0
6   3   A   0
6   4   A   0
1   1   B   0
1   2   B   0
1   3   B   0
1   4   B   0
2   1   B   0
2   2   B   0
2   3   B   0
2   4   B   0
3   1   B   0
3   2   B   1
3   3   B   0
3   4   B   1
4   1   B   1
4   2   B   1
4   3   B   0
4   4   B   0
5   1   B   0
5   2   B   0
5   3   B   0
5   4   B   0
6   1   B   3
6   2   B   0
6   3   B   0
6   4   B   0
1   1   C   0
1   2   C   0
1   3   C   0
1   4   C   0
2   1   C   0
2   2   C   0
2   3   C   0
2   4   C   0
3   1   C   0
3   2   C   0
3   3   C   1
3   4   C   0
4   1   C   0
4   2   C   1
4   3   C   0
4   4   C   1
5   1   C   0
5   2   C   0
5   3   C   0
5   4   C   0
6   1   C   1
6   2   C   0
6   3   C   0
6   4   C   0
1   1   D   0
1   2   D   0
1   3   D   0
1   4   D   0
2   1   D   1
2   2   D   0
2   3   D   0
2   4   D   0
3   1   D   0
3   2   D   0
3   3   D   1
3   4   D   2
4   1   D   0
4   2   D   0
4   3   D   4
4   4   D   3
5   1   D   0
5   2   D   1
5   3   D   0
5   4   D   0
6   1   D   3
6   2   D   3
6   3   D   2
6   4   D   2
1   1   E   1
1   2   E   0
1   3   E   0
1   4   E   0
2   1   E   0
2   2   E   0
2   3   E   0
2   4   E   0
3   1   E   2
3   2   E   0
3   3   E   2
3   4   E   2
4   1   E   2
4   2   E   1
4   3   E   2
4   4   E   3
5   1   E   0
5   2   E   0
5   3   E   2
5   4   E   0
6   1   E   6
6   2   E   1
6   3   E   3
6   4   E   1
1   1   F   0
1   2   F   0
1   3   F   0
1   4   F   0
2   1   F   0
2   2   F   0
2   3   F   0
2   4   F   0
3   1   F   0
3   2   F   0
3   3   F   0
3   4   F   0
4   2   F   2
4   3   F   2
4   4   F   0
4   4   F   0
5   1   F   0
5   2   F   0
5   3   F   0
5   4   F   0
6   1   F   8
6   2   F   10
6   3   F   4
6   4   F   7
1   1   G   0
1   2   G   0
1   3   G   0
1   4   G   0
2   1   G   0
2   2   G   0
2   3   G   0
2   4   G   0
3   1   G   0
3   2   G   0
3   3   G   0
3   4   G   1
4   1   G   1
4   2   G   0
4   3   G   1
4   4   G   1
5   1   G   0
5   2   G   0
5   3   G   0
5   4   G   0
6   1   G   1
6   2   G   1
6   3   G   1
6   4   G   3



  'data.frame': 168 obs. of  4 variables:
 $ Subject  : Factor w/ 6 levels "1","2","3","4",..: 1 1 1 1 2 2 2 2 3 3 ...
 $ Session  : int  1 2 3 4 1 2 3 4 1 2 ...
 $ Condition: Factor w/ 7 levels "A","B","C","D",..: 1 1 1 1 1 1 1 1 1 1 ...
 $ Count    : int  0 0 0 0 0 0 0 0 0 0 ...

There are many zeros (but these are "true" zeros, in that they are expected given the conditions - the individuals are able to produce the behaviors at all times, but do so very infrequently under certain conditions, particularly condition A).

When I used a glmer with poisson family in lme4 the model failed to converge and I was advised to use bglmer, from package blme. This is my code:

m<- bglmer(Count~Condition + Session + (1|Subject),data=dat, 
       family=poisson, fixef.prior = normal(cov = diag(9,8)), control=glmerControl(optimizer="bobyqa"))

Here is the output (with condition E as baseline):

    Cov prior  : Subject ~ wishart(df = 3.5, scale = Inf, posterior.scale = cov, common.scale = TRUE)
Fixef prior: normal(sd = c(3, 3, ...), corr = c(0 ...), common.scale = FALSE)
Prior dev  : 33.0517

Generalized linear mixed model fit by maximum likelihood (Laplace Approximation) ['bglmerMod']
 Family: poisson  ( log )
Formula: Count ~ Condition + Session + (1 | Subject)
   Data: dat
Control: glmerControl(optimizer = "bobyqa")

     AIC      BIC   logLik deviance df.resid 
   267.1    295.2   -124.6    249.1      159 

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-1.4950 -0.3985 -0.2257 -0.0574  4.0170 

Random effects:
 Groups  Name        Variance Std.Dev.
 Subject (Intercept) 3.408    1.846   
Number of obs: 168, groups:  Subject, 6

Fixed effects:
            Estimate Std. Error z value Pr(>|z|)    
(Intercept) -0.66015    0.80543  -0.820 0.412429    
ConditionA  -4.09023    1.33829  -3.056 0.002241 ** 
ConditionB  -1.33464    0.41532  -3.213 0.001311 ** 
ConditionC  -1.86472    0.51534  -3.618 0.000296 ***
ConditionD  -0.20941    0.28401  -0.737 0.460919    
ConditionF   0.20791    0.25717   0.808 0.418836    
ConditionG  -0.98801    0.36455  -2.710 0.006725 ** 
Session     -0.07237    0.08819  -0.821 0.411853    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Correlation of Fixed Effects:
           (Intr) CndtnA CndtnB CndtnC CndtnD CndtnF CndtnG
ConditionA -0.025                                          
ConditionB -0.098  0.051                                   
ConditionC -0.078  0.041  0.161                            
ConditionD -0.145  0.075  0.298  0.237                     
ConditionF -0.145  0.083  0.329  0.263  0.487              
ConditionG -0.113  0.058  0.231  0.184  0.341  0.377       
Session    -0.244  0.000 -0.002 -0.001 -0.003 -0.068 -0.002

My problem is that the residuals are not normally distributed (I'm having trouble uploading the pic of the QQplot, but here is the Shapiro test of residuals):

Shapiro-Wilk normality test

data:  resid(m)
W = 0.90746, p-value = 8.413e-09

I think this means that the SEs and p-values may be unreliable. I read that I should try to calculate robust SEs, but the "sandwich" package does not seem to work on bglmer models. Does anyone have any suggestions for how I might proceed, by either changing my model or calculating these SEs?

I have tried some other options, such as sqrt transformation and using lmer - this appears to solve the problem of the residuals, but then the model seems to be over-dispersed. I have also tried zero-inflated nb models (using glmmTMB) but they seem to have the same problem with the residuals. Any help would be much appreciated!

Lynn
  • 11
  • 1
  • 2
    Why do you think the residuals should be normal? You have estimated a Poisson model, not a normal model, so there is **no assumption of normal errors** in that model. One solution is to calculate simulated residuals, there is an R package, [DHARMa](https://stats.stackexchange.com/search?q=DHARMa+answers%3A1) but if that supports the package you used I don't know. You could still program the same idea! – kjetil b halvorsen Jan 05 '19 at 10:29
  • Ah, I thought I had read that non-normal residuals would cause a problem even in Poisson models, but must have misunderstood. Thank you for your reply @kjetilbhalvorsen! – Lynn Jan 08 '19 at 05:13

0 Answers0