I'm struggling to understand the topic of deviance. Consider the following two models:

Model 1: glm.nb(Resp ~ Parm1 + Parm2 + Parm3)

Model 2: glm.nb(Resp ~ Parm1 + Parm2)

The only difference between the two models is that the Parm3 was removed for model 2. Why do I get different null deviances? In the case of a Gaussian glm, the null deviance is always equal to:

deviance(glm(Resp~1))

This doesn't seem to hold for glm.nb. Why not?

Gavin Simpson
Airone

1 Answer

You are incorrect in stating that "The only difference between the two models is that the Parm3 was removed for model 2". You are overlooking the fact that glm.nb() is also estimating the $\theta$ parameter of the Negative Binomial model.

Here is an example from ?glm.nb

quine.nb1 <- glm.nb(Days ~ Sex/(Age + Eth*Lrn), data = quine)
quine.nb2 <- update(quine.nb1, . ~ . + Sex:Age:Lrn)
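
A quick way to see this in the example (a minimal sketch, assuming the MASS package and its quine data are available; the names theta_hat and null_dev are introduced here for illustration) is to print the estimated $\theta$ and the null deviance stored on each fit:

```r
library(MASS)  # provides glm.nb() and the quine data

quine.nb1 <- glm.nb(Days ~ Sex/(Age + Eth*Lrn), data = quine)
quine.nb2 <- update(quine.nb1, . ~ . + Sex:Age:Lrn)

# Each fit carries its own estimate of theta ...
theta_hat <- c(nb1 = quine.nb1$theta, nb2 = quine.nb2$theta)
# ... and a null deviance evaluated with that estimate
null_dev <- c(nb1 = quine.nb1$null.deviance, nb2 = quine.nb2$null.deviance)

print(theta_hat)
print(null_dev)
```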

In this setting, quine.nb2 is your model 1 and quine.nb1 is your model 2, and the null deviances do indeed differ. However, if we fit the model of quine.nb1 using the estimated value of $\theta$ from quine.nb2 we should see the same null deviances. Here I refit using glm() and the negative.binomial() family function so I am certain the same code is used for fitting and I can fix $\theta$ at a "known" value.

theta <- quine.nb2$theta
f1 <- formula(quine.nb1)
f2 <- formula(quine.nb2)

m1 <- glm(f1, data = quine, family = negative.binomial(theta = theta))
m2 <- glm(f2, data = quine, family = negative.binomial(theta = theta))

Now we extract the Null deviance from the two models

> m1$null.deviance
[1] 244.944
> m2$null.deviance
[1] 244.944

and they are the same.

The key point is that with glm.nb() the reported null deviance is conditional upon the estimated value of $\theta$. You have to be careful that you are comparing like with like.
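
Once $\theta$ is held fixed, the relationship from the question should hold again. As a minimal sketch (assuming MASS is available and reusing the $\theta$ estimate from quine.nb2; the names m0 and same are introduced here for illustration), the deviance of an intercept-only model fitted with the same $\theta$ can be compared with the null deviance reported for the larger model:

```r
library(MASS)  # provides glm.nb(), negative.binomial() and the quine data

quine.nb1 <- glm.nb(Days ~ Sex/(Age + Eth*Lrn), data = quine)
quine.nb2 <- update(quine.nb1, . ~ . + Sex:Age:Lrn)

# Fix theta at a "known" value taken from one of the fits
theta <- quine.nb2$theta

# Model with covariates and intercept-only model, both using the same theta
m1 <- glm(formula(quine.nb1), data = quine,
          family = negative.binomial(theta = theta))
m0 <- glm(Days ~ 1, data = quine,
          family = negative.binomial(theta = theta))

# Compare the intercept-only deviance with the reported null deviance,
# allowing for a small numerical tolerance from the iterative fit
same <- isTRUE(all.equal(deviance(m0), m1$null.deviance, tolerance = 1e-6))
print(same)
```

The comparison uses a tolerance rather than exact equality because the intercept-only model is fitted iteratively, whereas the null deviance is evaluated directly at the mean of the response.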

Gavin Simpson
  • Hello Mr. Simpson, thanks for your quick and detailed answer! I had already suspected it had something to do with theta, but I was not sure. I'm trying to calculate the percentage of deviance explained by this Parm3. Would it be OK to fix theta and compare the difference between the residual deviances of model 1 and model 2 with the null deviance? Best wishes, Airone. – Airone Oct 31 '14 at 07:52
  • @Airone These things get tricky and I am not sure of the correct answer. Sounds like that would be a good question for [stats.se], so I suggest you ask another question, linking to this one for some of the detail. – Gavin Simpson Oct 31 '14 at 16:54
  • I did as you suggested and explained the complete case [here](http://stats.stackexchange.com/questions/122459/relative-variable-importance-for-glm-glm-nb-percentage-of-deviance-explained). Thanks! Airone – Airone Nov 03 '14 at 09:06