I am getting very different results from glmer (using version 0.999375.37 of lme4) when running in different environments. I can't pinpoint the source of the discrepancy, but let's assume it's some floating-point precision difference in the input data. Here is the invocation:
glmer(n ~ 1 + offset(log(mu)) + (1 | id), data = data,
      family = poisson(link = "log"), weights = w, verbose = TRUE)
Sometimes the output is this:
0: 532.77427: 0.610172 -0.0811043
1: 525.91180: 0.531918 -0.102716
2: 511.77820: 0.373325 -0.137516
3: 495.51600: 0.00000 -0.127691
4: 478.25662: 0.00000 -0.100781
5: 474.47031: 0.00000 -0.0806405
6: 474.46819: 0.00000 -0.0811088
7: 474.46819: 0.00000 -0.0811044
8: 474.46819: 0.00000 -0.0811043
9: 474.46819: 0.00000 -0.0811043
Generalized linear mixed model fit by the Laplace approximation
Formula: n ~ 1 + offset(log(mu)) + (1 | id)
Data: data
         AIC           BIC        logLik    deviance
 478.4681939 488.556259695 -237.23409695 474.4681939
Random effects:
 Groups Name        Variance Std.Dev.
 id     (Intercept) 0        0
Number of obs: 1146, groups: id, 160
Fixed effects:
                    Estimate      Std. Error  z value   Pr(>|z|)
(Intercept) -0.0811043131501 0.0100762538900 -8.04905 8.3436e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
and sometimes it's this:
0: 532.77427: 0.610172 -0.0811043
1: 525.91180: 0.531918 -0.102716
2: 511.77820: 0.373325 -0.137516
3: 495.51540: 0.00000 -0.127690
4: 478.25643: 0.00000 -0.100780
5: 474.47031: 0.00000 -0.0806405
6: 474.46819: 0.00000 -0.0811088
7: 474.46819: 0.00000 -0.0811044
8: 474.46819: 0.00000 -0.0811043
9: 474.46819: 0.00000 -0.0811043
10: 474.46819: 0.00000 -0.0811043
11: 474.46819: 1.29719e-05 0.568362
Generalized linear mixed model fit by the Laplace approximation
Formula: n ~ 1 + offset(log(mu)) + (1 | id)
Data: data
  AIC  BIC logLik deviance
 5700 5710  -2848     5696
Random effects:
 Groups Name        Variance   Std.Dev.
 id     (Intercept) 1.6827e-10 1.2972e-05
Number of obs: 1146, groups: id, 160
Fixed effects:
            Estimate Std. Error z value Pr(>|z|)
(Intercept) 0.568362   0.007282   78.05   <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Note that fitting the model without the random effect gives:
Call: glm(formula = n ~ 1 + offset(log(mu)), family = poisson(link = log),
data = data, weights = w)
Coefficients:
(Intercept)
-0.0811043131378
First question: Why is the result unstable? IIUC the model is of the form
n ~ Pois(mu * exp(a + b_id))
where the b_id are drawn from a normal distribution with mean 0 and standard deviation theta. In the verbose output the third column is a, and the second column is theta (note that the final value 1.29719e-05 in the second run matches the reported Std.Dev.). As theta approaches zero, the b_id must go to zero, so I would expect a to converge to the maximum-likelihood estimate for the model without random effects. That is exactly what we see, except for iteration 11 in the second run.
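To make this concrete, here is how I check that claim (a minimal sketch; fit.glmer and fit.glm are just names I'm giving the two fits shown above):

library(lme4)
fit.glmer <- glmer(n ~ 1 + offset(log(mu)) + (1 | id), data = data,
                   family = poisson(link = "log"), weights = w)
fit.glm <- glm(n ~ 1 + offset(log(mu)), data = data,
               family = poisson(link = "log"), weights = w)
# if theta -> 0, this difference should be ~0, as in the first run above
fixef(fit.glmer)["(Intercept)"] - coef(fit.glm)["(Intercept)"]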
Second question: How can I automatically detect and avoid this? Is it sufficient to fall back to the simple model (without the random effect) when the estimated theta is below some threshold?
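Something like this is what I have in mind (a sketch: the 1e-5 cutoff is an arbitrary guess, and I'm assuming VarCorr exposes the standard deviation via its "stddev" attribute as documented):

fit <- glmer(n ~ 1 + offset(log(mu)) + (1 | id), data = data,
             family = poisson(link = "log"), weights = w)
theta.hat <- attr(VarCorr(fit)$id, "stddev")  # estimated random-effect SD
if (theta.hat < 1e-5) {
  # variance has collapsed to ~0: refit the plain GLM instead
  fit <- glm(n ~ 1 + offset(log(mu)), data = data,
             family = poisson(link = "log"), weights = w)
}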
Note that I'm interested in the estimated values for each level, not in the variability (the opposite of the situation discussed in this FAQ). I also want some penalty on differences between the values for different ids, which is why I don't want to model id as a fixed effect.
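For reference, the per-id estimates I'm after are the shrunken values, i.e. the fixed intercept plus each id's conditional mode (a sketch using the standard fixef/ranef accessors):

b <- ranef(fit)$id  # data frame with one "(Intercept)" column, one row per id
# one shrunken log-scale estimate per id
per.id <- fixef(fit)["(Intercept)"] + b[["(Intercept)"]]
names(per.id) <- rownames(b)
head(per.id)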