I'm fitting a binary logistic GLMM with ungrouped data (every response is 0 or 1).
This thread and the documentation of anova.merMod both say that the significance of the random intercept cannot be tested with an LRT against the fixed-effects model if the random-intercept model was fit by adaptive quadrature with more than 1 quadrature point. More specifically, the documentation says that the output of logLik() for a GLMM is 'only proportional to the log probability density of the response variable'. Here's a silly question coming from a non-statistician:
Why not just compute the log-likelihood of the random-intercept model manually, from the response vector and the fitted values, and compare it to the corresponding log-likelihood of the fixed-effects model? Like so:
## manual log-likelihoods: sum of Bernoulli log PMFs at the fitted probabilities
LL.mixed <- sum(dbinom(data$response, size = 1, prob = fitted(mixed.model), log = TRUE))
LL.fixed <- sum(dbinom(data$response, size = 1, prob = fitted(fixed.model), log = TRUE))
## LRT with 1 df; p-value halved because the variance is tested on its boundary
pchisq((-2*LL.fixed) - (-2*LL.mixed), df = 1, lower.tail = FALSE) / 2
It seems straightforward enough: the model's log-likelihood is just the sum of the log probability densities of the data points given their respective fitted values, no? Why should the number of quadrature points used in estimating the random effect matter at all? It enters nowhere in this simple calculation of the model's log-likelihood, so what's the complication?
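(For a plain logistic GLM the manual sum really does match logLik(); a minimal self-contained check, with simulated data and variable names of my own:)

```r
set.seed(1)
n <- 200
x <- rnorm(n)
y <- rbinom(n, size = 1, prob = plogis(-0.5 + x))  # simulate a Bernoulli response
fit <- glm(y ~ x, family = binomial)
## manual log-likelihood: sum of Bernoulli log PMFs at the fitted probabilities
LL.manual <- sum(dbinom(y, size = 1, prob = fitted(fit), log = TRUE))
all.equal(LL.manual, as.numeric(logLik(fit)))  # TRUE
```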
I now notice that while the above calculation using sum() and dbinom() exactly replicates the output of logLik() for any logistic GLM, the two do not match for GLMMs, not even when the number of quadrature points is 1 (using glmer()). What in the world is going on? Isn't the log-likelihood of any logistic model, mixed or fixed, based quite simply on the responses, the fitted values, and the binomial PMF? Is it me who is f**ing up, or is it logLik()?
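Edit: playing with a toy example in base R (all numbers invented by me), I can at least see where an integral over the random effect could enter: the marginal likelihood of a cluster averages the cluster's joint Bernoulli PMF over the distribution of the random intercept, and that is not the same as plugging a single fitted probability into dbinom(). Whether this is what logLik() actually reports for a merMod object, I don't know.

```r
beta0 <- -0.5           # fixed intercept (made up)
sigma <- 1.2            # random-intercept SD (made up)
y     <- c(1, 0, 1)     # responses in a single cluster

## conditional log-likelihood with the random effect held at its mean (u = 0)
LL.cond <- sum(dbinom(y, size = 1, prob = plogis(beta0), log = TRUE))

## marginal log-likelihood: integrate the cluster's joint PMF over u ~ N(0, sigma^2)
integrand <- function(u)
  sapply(u, function(ui) prod(dbinom(y, 1, plogis(beta0 + ui)))) * dnorm(u, 0, sigma)
LL.marg <- log(integrate(integrand, -Inf, Inf)$value)

c(conditional = LL.cond, marginal = LL.marg)  # the two differ
```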