I am having difficulty figuring out how to estimate the dispersion parameter needed to compute QAICc for a binomial GLMM.

I have tested for overdispersion using this code:

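## Requires a fitted mixed model (e.g. from lme4, which provides VarCorr() and fixef())
overdisp_fun <- function(model) {
  ## number of variance parameters in
  ##   an n-by-n variance-covariance matrix
  vpars <- function(m) {
    nrow(m) * (nrow(m) + 1) / 2
  }
  ## model df = variance-covariance parameters + fixed-effect coefficients
  model.df <- sum(sapply(VarCorr(model), vpars)) + length(fixef(model))
  ## residual df = number of observations - model df
  rdf <- nrow(model.frame(model)) - model.df
  ## Pearson chi-square statistic and its ratio to the residual df
  rp <- residuals(model, type = "pearson")
  Pearson.chisq <- sum(rp^2)
  prat <- Pearson.chisq / rdf
  ## upper-tail p-value: small values suggest overdispersion
  pval <- pchisq(Pearson.chisq, df = rdf, lower.tail = FALSE)
  c(chisq = Pearson.chisq, ratio = prat, rdf = rdf, p = pval)
}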

With this code, I have found that some of my candidate models show signs of overdispersion, while others do not. I have tried QAICc in MuMIn, but I am having difficulty figuring out how to calculate c-hat properly. Could anyone point me in the right direction?

Also, ranking by AICc alone leaves me with two candidate models, one that shows signs of overdispersion and one that does not. How does one average candidate models if one should be assessed by QAICc and the other by AICc?

Ben Bolker
birdnerd_j

1 Answer

A couple of points:

  • technically, you're not supposed to estimate $\hat c$ for each one of your models. Rather, the estimate of $\hat c$ is done for the most complex (full) model, then applied (without re-estimating it) to compute QAIC(c) for all of the other models. Among other things, the reduced models will always have higher (estimated) $\hat c$, since $\hat c$ is essentially a measure of residual variance. This is reflected in the example given in ?QAIC:
budworm.lg <- glm(SF ~ sex*ldose, data = budworm, family = binomial)
chat <- deviance(budworm.lg) / df.residual(budworm.lg)
dredge(budworm.lg, rank = "QAIC", chat = chat)

Here budworm.lg is the full model; we calculate $\hat c$ from it and apply it uniformly (via the chat argument to dredge) to all of the other models. (Note that this example uses $\textrm{deviance}/\textrm{df}_{\textrm{resid}}$ rather than the Pearson statistic $\sum r_i^2/\textrm{df}_{\textrm{resid}}$ as the estimator for $\hat c$. Both are reasonable approximations; there's much discussion of the properties of overdispersion estimators elsewhere.)

  • you say that your data are binomial - I'm assuming $N>1$; otherwise (i.e. for ungrouped $N=1$ [Bernoulli] responses) it's hard/not necessarily sensible to compute overdispersion at all.
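The first point above (estimate $\hat c$ once from the full model, then reuse it for every candidate) can be sketched in base R as follows. This is only an illustration under stated assumptions: the data frame `toy` holds hypothetical grouped-binomial counts, the helper `qaicc()` is a hand-rolled name rather than MuMIn's function, it uses plain GLMs rather than GLMMs to stay self-contained, and counting $\hat c$ as one extra estimated parameter is a common convention rather than something stated in the answer. In practice the chat argument of MuMIn's QAICc or dredge, as in the ?QAIC example, would do the same job.

```r
# Toy grouped-binomial data: hypothetical values, 20 trials per row
toy <- data.frame(
  ldose   = rep(0:5, 2),
  sex     = factor(rep(c("M", "F"), each = 6)),
  numdead = c(10, 4, 9, 12, 18, 20, 0, 2, 6, 10, 12, 16)
)
toy$SF <- cbind(numdead = toy$numdead, numalive = 20 - toy$numdead)

# Full (most complex) model and one reduced candidate
full    <- glm(SF ~ sex * ldose, data = toy, family = binomial)
reduced <- glm(SF ~ ldose,       data = toy, family = binomial)

# c-hat is estimated from the full model only:
# Pearson chi-square divided by the residual degrees of freedom
chat <- sum(residuals(full, type = "pearson")^2) / df.residual(full)

# Hand-rolled QAICc (assumption: K = model parameters + 1 for c-hat)
qaicc <- function(model, chat) {
  ll <- logLik(model)
  K  <- attr(ll, "df") + 1
  n  <- nobs(model)
  -2 * as.numeric(ll) / chat + 2 * K + 2 * K * (K + 1) / (n - K - 1)
}

# The same c-hat is applied to every candidate model
scores <- c(full = qaicc(full, chat), reduced = qaicc(reduced, chat))
scores
```

Because every model is scored with the same `chat`, the resulting values are on a common scale and can be compared or turned into weights, which is not the case if some models are scored by AICc and others by QAICc.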
Ben Bolker
  • Thank you for responding. The binomial data are actually age cohorts for a bird species. 0 = second year, 1 = after second year. – birdnerd_j Aug 01 '16 at 15:05