Do zero counts need to be adjusted for a likelihood ratio test of Poisson/loglinear models?

Question

If there are 0's in the contingency table and we're fitting nested Poisson/loglinear models (using R's `glm` function) for a likelihood ratio test, do we need to adjust the data before fitting the models (e.g. add 1/2 to all the counts)? Obviously some parameters cannot be estimated without some adjustment, but how does adjusting, or not adjusting, affect the LR test?

Presumably the `glm` routine would bonk if it could not handle zeros. Have you tried it? – shabbychef, Jun 06 '11 at 03:16
Yes, it doesn't crash, but depending on the formula (e.g. in a saturated model), some of the parameters can have effectively infinite standard errors. My question is whether this is a problem when doing a likelihood ratio test. You can still calculate a likelihood even if some parameters aren't estimated; those parameters just won't contribute to the likelihood. What's the standard practice and why? – BR1, Jun 06 '11 at 13:16
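To make the comment above concrete, here is a minimal sketch (not from the thread) that computes the likelihood ratio statistic G² for the independence model against the saturated model on a two-way table, once on the raw counts and once after adding 1/2 to every cell. The table values and the function names `g2_independence` and `chi2_sf_1df` are hypothetical, chosen only for illustration. It relies on the closed-form fitted counts of the independence model, so no model-fitting routine is needed, and on the convention that a zero cell contributes nothing to G².

```python
import math


def g2_independence(table):
    """LR statistic (G^2) of the independence model vs. the saturated model."""
    n = sum(sum(row) for row in table)
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    g2 = 0.0
    for i, row in enumerate(table):
        for j, y in enumerate(row):
            mu = row_tot[i] * col_tot[j] / n  # fitted count under independence
            if y > 0:
                # A zero cell adds nothing: y * log(y / mu) -> 0 as y -> 0.
                g2 += 2.0 * y * math.log(y / mu)
    return g2


def chi2_sf_1df(x):
    """Upper tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))


# Hypothetical 2x2 table with one sampling zero (illustrative values only).
raw = [[0, 7], [9, 4]]
adjusted = [[y + 0.5 for y in row] for row in raw]

g2_raw = g2_independence(raw)
g2_adj = g2_independence(adjusted)

print(f"raw counts:    G2 = {g2_raw:.3f}, p = {chi2_sf_1df(g2_raw):.4f}")
print(f"counts + 0.5:  G2 = {g2_adj:.3f}, p = {chi2_sf_1df(g2_adj):.4f}")
```

The statistic is finite without any adjustment, and adding 1/2 changes its value, so the adjustment is not neutral with respect to the test. This sketch only covers the case where the smaller model has strictly positive fitted counts; the chi-square reference distribution is itself an approximation that can be poor for sparse tables.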

Fomite · Answer 1 · 2012-05-02T01:58:45.313

One of the powers of regression modeling generally is that you can smooth over areas of no data, though as you have noticed, there are occasionally problems in estimating parameters. I would suggest that if you're getting things like infinite standard errors, it's time to reconsider your modeling approach a bit.
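As a sketch of where the infinite standard errors come from, assume a cell that has its own parameter $\theta = \log \mu$, as in a saturated Poisson model (the notation here is introduced for illustration and is not from the original post). Its contribution to the log-likelihood and its derivatives are:

$$
\ell(\theta) = y\,\theta - e^{\theta} - \log y!, \qquad
\ell'(\theta) = y - e^{\theta}, \qquad
-\ell''(\theta) = e^{\theta} = \mu .
$$

With $y = 0$ this reduces to $\ell(\theta) = -e^{\theta}$, which increases towards $0$ as $\theta \to -\infty$. The estimate therefore runs off to the boundary, the information $\mu$ goes to $0$, and the Wald standard error $1/\sqrt{\mu}$ diverges, while the cell's contribution to the maximized log-likelihood stays finite (it tends to $0$).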

One particular note of caution: there is a difference between having no counts in a particular stratum and it being impossible for there to be counts in that stratum. For example, imagine you're working on a study of psychological disorders for the U.S. Navy between, say, 2000 and 2009, and have binary regression terms for both "Is a Woman" and "Serves on a Submarine". A regression model may be able to estimate effects where both variables = 1 despite having a zero count where both = 1. However, that inference wouldn't be valid, because such a circumstance is impossible. This problem is called "non-positivity" and is occasionally a problem in highly stratified models.

edited May 02 '12 at 01:58

answered Apr 22 '12 at 01:59


@skyguy94 Oddly enough I don't - I knew that, I had just forgotten to note the use of a retrospective data set >.<. Edited to reflect that. – Fomite Apr 22 '12 at 22:53
Re: "A regression model may be able to estimate effects where both variables = 1, **or interactions between the two**" - I don't think that's true. If you have two binary predictors that are never '1' together, then the interaction is constant (it is always '0'), so its effect is not identified. – Macro May 02 '12 at 01:55
@Macro You're right, I'm editing slightly. I was thinking for terms where they're not binary indicators. – Fomite May 02 '12 at 01:58
(+1) So, issues with non-plausibility of the case where both=1 aside, the model-based estimate would just be the sum of the two marginal effects, which we know can be very misleading in its own right :) – Macro May 02 '12 at 02:01
