I have observations from four different groups of people. The dependent variable is a count (medical emergencies in the past 2 months). Within each group, the dependent variable follows a negative binomial distribution with its mode at 0. I would like to test whether the categorical factor "group" has a significant effect on the number of reported medical emergencies. What I have done so far: I fitted a negative binomial GLM using glm.nb (from the MASS package) in R and examined the fitted model using anova(). I would really appreciate it if someone could confirm whether this procedure is appropriate for a negative binomial response variable and a categorical factor.
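To make the procedure concrete, here is a minimal sketch of what I did. The data are simulated stand-ins for my real data; the names d, group, emergencies and the chosen means are hypothetical. The last line is not something I ran originally: it is the comparison against an intercept-only model, which as far as I understand lets anova() re-estimate theta for each model.

```r
library(MASS)  # provides glm.nb

set.seed(1)

# Simulated stand-in data (hypothetical): four groups of 50 people each
d <- data.frame(group = factor(rep(c("A", "B", "C", "D"), each = 50)))

# Assumed group means; size = 1 gives a distribution with its mode at 0
mu <- c(A = 0.5, B = 0.8, C = 1.2, D = 2.0)
d$emergencies <- rnbinom(nrow(d), size = 1, mu = mu[as.character(d$group)])

# Negative binomial GLM with group as a categorical predictor
fit <- glm.nb(emergencies ~ group, data = d)

# Intercept-only model for comparison
fit0 <- glm.nb(emergencies ~ 1, data = d)

# Sequential test on a single fit: theta is held fixed, hence the warning
anova(fit)

# Likelihood ratio test between two fits, each with its own estimated theta
lrt <- anova(fit0, fit)
lrt
```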
My second question: calling anova() on the result of glm.nb produces the warning "tests made without re-estimating 'theta'". As far as I understand, theta is the dispersion parameter of the negative binomial distribution. However, the observations in each of the four groups have very different dispersions. Does R estimate a single, "mean" theta for all four distributions?
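To illustrate what I mean, here is a sketch with simulated data in which the groups are deliberately given different dispersions. All names and the values in sizes are hypothetical, not my real data. It compares the single theta stored in the fitted model with the thetas from separate intercept-only fits per group.

```r
library(MASS)  # provides glm.nb

set.seed(2)

# Hypothetical per-group dispersion (size) values, same mean in every group
sizes <- c(A = 0.3, B = 0.6, C = 1, D = 2)
d <- data.frame(group = factor(rep(names(sizes), each = 200)))
d$emergencies <- rnbinom(nrow(d), size = sizes[as.character(d$group)], mu = 1)

# One model for all groups: the fit stores exactly one theta
fit <- glm.nb(emergencies ~ group, data = d)
fit$theta
length(fit$theta)

# Separate intercept-only fits, giving one theta estimate per group
theta_by_group <- sapply(
  split(d, d$group),
  function(g) glm.nb(emergencies ~ 1, data = g)$theta
)
theta_by_group
```

In this sketch the pooled model reports one value of theta, while the per-group fits report four, which is the discrepancy I am asking about.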