Question: what are the assumptions of a negative binomial regression? Do continuous control variables (the DV and main IV are binary) need to follow a normal distribution? I have been searching online but could not find the answer to this question.

Some of the controls are: ROA (return on assets, a ratio) and Sales (very skewed).

Help would be greatly appreciated! I have two stats books that I have been referencing to no avail.

1 Answer

No, they do not have to be normal.

However, it makes sense to think about how the variables may logically relate. E.g., if they might be roughly proportional, then log-transforming the covariate makes sense (the regression typically models the log of the expected count, e.g. log expected sales = coefficient times log sales in the last year).
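To make the proportionality point concrete, here is a minimal sketch assuming a log link, which is the usual choice for negative binomial regression. The function names and all coefficient values are hypothetical, chosen only for illustration. With a logged covariate, doubling the covariate multiplies the expected count by 2 to the power of the slope; with a raw covariate, the multiplier depends on the absolute size of the change.

```python
import math

def expected_count_logged(x, intercept, slope):
    """Mean under a log link with a log-transformed covariate:
    log(mu) = intercept + slope * log(x), i.e. mu = exp(intercept) * x**slope."""
    return math.exp(intercept + slope * math.log(x))

def expected_count_raw(x, intercept, slope):
    """Mean under a log link with the covariate left on its raw scale:
    log(mu) = intercept + slope * x."""
    return math.exp(intercept + slope * x)

# Hypothetical coefficients, not estimated from any data.
# Logged covariate with slope 1: the expected count is proportional to x.
ratio_logged = (expected_count_logged(200.0, 0.2, 1.0)
                / expected_count_logged(100.0, 0.2, 1.0))

# Raw covariate: going from 100 to 200 multiplies the mean by exp(slope * 100).
ratio_raw = (expected_count_raw(200.0, 0.2, 0.01)
             / expected_count_raw(100.0, 0.2, 0.01))

print(f"logged covariate, doubling x multiplies the mean by {ratio_logged:.3f}")
print(f"raw covariate, adding 100 to x multiplies the mean by {ratio_raw:.3f}")
```

If the slope on the logged covariate is close to 1, the expected count scales roughly in proportion to the covariate, which is the situation described above.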

Alternatively, think in terms of influence: the extreme covariate values in a very skewed distribution will have a lot of influence on the prediction, while getting the covariate closer to normal, e.g. with a log transformation, will reduce that. Whether that is good or bad is another question and depends on whether extreme values are truly as predictive of extreme expected values as the regression equation implies.
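The influence argument can be illustrated with a rough sketch. It uses the ordinary least squares leverage formula for a single covariate as a proxy; in a fitted negative binomial model the leverage also depends on the fitted weights, so treat this as an approximation only. The `sales` values are synthetic and deterministic, made up purely to mimic a heavily right-skewed variable.

```python
import math

def leverages(x):
    """OLS-style leverage for a straight-line fit with an intercept:
    h_i = 1/n + (x_i - mean)^2 / sum_j (x_j - mean)^2."""
    n = len(x)
    mean = sum(x) / n
    ss = sum((v - mean) ** 2 for v in x)
    return [1.0 / n + (v - mean) ** 2 / ss for v in x]

# Synthetic, strongly right-skewed "sales": exponentials of evenly spaced values.
n = 101
z = [-2.5 + 5.0 * i / (n - 1) for i in range(n)]
sales = [math.exp(1.5 * v) for v in z]
log_sales = [math.log(v) for v in sales]

max_lev_raw = max(leverages(sales))
max_lev_log = max(leverages(log_sales))

print(f"largest leverage, raw sales: {max_lev_raw:.3f}")
print(f"largest leverage, log sales: {max_lev_log:.3f}")
```

On the raw scale a handful of very large values carry most of the leverage; after the log transformation the leverage is spread far more evenly across observations.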

In general, it might be good to transform (e.g. log-transform) a covariate that is a ratio, because ratios have funny distributions, with everything below 1 forced into the tiny interval from 0 to 1 and values greater than 1 getting all of 1 to infinity. As a result, a halving (0.5) ends up being a much smaller effect than a doubling (2), while the two are treated the same on the log scale.
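The asymmetry between halving and doubling is easy to check numerically. The following small sketch uses only the three ratio values mentioned above; the variable names are illustrative.

```python
import math

halved, unchanged, doubled = 0.5, 1.0, 2.0

# Distance from "no change" (a ratio of 1) on the raw scale.
raw_down = unchanged - halved   # effect of a halving
raw_up = doubled - unchanged    # effect of a doubling

# The same distances on the log scale.
log_down = math.log(unchanged) - math.log(halved)
log_up = math.log(doubled) - math.log(unchanged)

print(f"raw scale: halving moves {raw_down}, doubling moves {raw_up}")
print(f"log scale: halving moves {log_down:.4f}, doubling moves {log_up:.4f}")
```

On the raw scale the doubling is twice as far from 1 as the halving, while on the log scale both are the same distance, log(2), from 0. Note that the log is only defined for strictly positive values, so a ratio that can be zero or negative would need a different treatment.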

Björn