
To the best of my knowledge, the dependent variable (outcome) in a negative binomial regression should ideally represent count data. In R, if the interest is to model rates, I understand the use of + offset(log(population)) in glm.nb or glmmadmb.
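For reference, this is roughly how the offset turns a count model into a rate model under the log link. The symbols are assumptions for illustration: $Y_i$ is the count, $N_i$ the population, and $x_i$ a single covariate.

```latex
\log \mathbb{E}[Y_i] = \log N_i + \beta_0 + \beta_1 x_i
\quad\Longleftrightarrow\quad
\frac{\mathbb{E}[Y_i]}{N_i} = \exp(\beta_0 + \beta_1 x_i)
```

The coefficient of $\log N_i$ is fixed at 1, so $\exp(\beta_1)$ reads as a rate ratio per unit change in $x_i$.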

My question is related to conditions for other covariates. Are incidence/prevalence rates appropriate for additional independent variables? For example, if I'm trying to control for the incidence rate of diabetes per 100,000 within a group, could that be an acceptable independent variable in the negative binomial model?

kjetil b halvorsen
  • As long as it has a linear relationship with your dependent variable, it should be OK. Also remember to center it. – StupidWolf May 15 '20 at 22:10
  • Thanks for the response. What do you mean by centering? – Sahit Menon May 15 '20 at 23:50
  • You subtract the mean of the variable from each value, that is, you center the variable around its mean. See the answer by Macro at https://stats.stackexchange.com/questions/29781/when-conducting-multiple-regression-when-should-you-center-your-predictor-varia – StupidWolf May 15 '20 at 23:55
  • @stupidwolf With negative binomial regression, the relationship with the dependent variable would depend on the link function being used. – Glen_b May 16 '20 at 11:17
  • @Glen_b-ReinstateMonica, thanks for pointing that out. SahitMenon, by default glm.nb in R fits the model with a log link, so you should check that the log of the rates shows a linear relationship with your condition / covariates. – StupidWolf May 17 '20 at 10:46
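
The suggestions in the comments above could be sketched in R as follows. This is only a minimal illustration on simulated data, not a definitive analysis: the data frame, the column names (`population`, `diabetes_rate`, `diab_c`, `cases`), and all numeric values are hypothetical. It centers a rate covariate, then fits `glm.nb` from the MASS package with its default log link and a population offset.

```r
library(MASS)  # provides glm.nb

set.seed(1)
n <- 200

# Hypothetical simulated data: names and values are illustrative only
dat <- data.frame(
  population    = round(runif(n, 5e3, 5e4)),
  diabetes_rate = runif(n, 200, 1200)  # incidence per 100,000
)

# Center the covariate around its mean and rescale to units of 100 per 100,000
dat$diab_c <- (dat$diabetes_rate - mean(dat$diabetes_rate)) / 100

# Simulate counts whose log rate is linear in the centered covariate
mu <- dat$population * exp(-6 + 0.1 * dat$diab_c)
dat$cases <- rnbinom(n, size = 5, mu = mu)

# Negative binomial fit (log link by default) with a population offset
fit <- glm.nb(cases ~ diab_c + offset(log(population)), data = dat)

# Rate ratio for a 100-unit increase in the diabetes incidence rate
rate_ratio <- exp(coef(fit)["diab_c"])
print(rate_ratio)
```

Because the covariate is centered, the intercept is the log rate at the mean diabetes incidence, and the rescaling keeps the slope on a readable scale.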
