The residual deviance of a GLM is defined as:
$2\,(\log L_{\text{Saturated Model}} - \log L_{\text{GLM Model}})$

where the saturated model is the model that has as many parameters as there are data points.

For a given dataset, however, there appear to be many saturated models.

For example, take a dataset with 5 data points on the variables $Y, X_1, X_2, X_3$ (where the $X_i$ are the independent variables). The saturated model should then have the form (assuming a log link):

$\log(Y) = \text{coef}_1 \cdot a_1 + \text{coef}_2 \cdot a_2 + \dots + \text{coef}_5 \cdot a_5$

We can choose different combinations of $X_1, X_2, X_3$ to represent the $a_i$ (for example, $a_1 = X_1/X_2$ or $a_2 = X_1 + X_2$) until we have 5 different combinations to fill in for the $a_i$ in the equation.

However, when I run a GLM regression in R, the output gives a single, unique residual deviance for the model.

I have 2 questions:

  1. When there are many saturated models, which one is used to calculate the residual deviance?

  2. Is there anything that guarantees that the log-likelihood is the same for all saturated models (so that the residual deviance is unique)?

Thank you very much for your help!

  • Do you want to quote exactly from your source for that definition? – seanv507 Jun 08 '21 at 22:33
  • @seanv507: Hi, thanks for your comment. I had the sign wrong in the formula for the deviance. The source is Wikipedia, but the definition appears almost everywhere, so I don't think it is necessary to give a source. I would be grateful for any answer to this question. I have been struggling with it for quite a while. – InTheSearchForKnowledge Jun 09 '21 at 20:23
  • The saturated model is not being fitted. It is mathematically derived. Its fitted values are equal to the observed response, see https://stats.stackexchange.com/questions/184753/in-a-glm-is-the-log-likelihood-of-the-saturated-model-always-zero for details. – Michael M Jun 09 '21 at 20:59
  • @Michael M: Hi, thanks for your comment. I have read the link you posted many times but can't fully grasp the idea. By "is not being fitted", do you mean that to derive the saturated model we don't care about (and actually don't need) the explanatory variables $X_i$? That is, we only care about the observed values of the response variable $Y$. Am I right? Thank you very much for your help! – InTheSearchForKnowledge Jun 09 '21 at 21:05
  • Yes, exactly. You *could* fit a saturated model by using a dummy variable for each observation (am I observation 1? etc). But that would be done just for fun. The software takes a shortcut. – Michael M Jun 09 '21 at 21:08
  • @Michael M: Thank you so much! It helps me a lot. If you have some free time, could you please have a look at this question: https://stats.stackexchange.com/questions/530064/difference-between-saturated-model-and-usual-model-in-glm I would very much appreciate your help! – InTheSearchForKnowledge Jun 09 '21 at 21:13
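
The point made in the comments can be checked numerically. The sketch below is only an illustration under assumptions that are not in the question: a Poisson response with a log link, and a made-up vector of 5 positive counts. It builds two different saturated parametrisations (one dummy per observation, and a hypothetical cumulative-indicator basis), shows that both reproduce the observed $Y$ exactly, and therefore give the same saturated log-likelihood. The explanatory variables never enter that likelihood.

```python
import math

def poisson_loglik(y, mu):
    """Poisson log-likelihood of counts y under means mu (assumes all y, mu > 0)."""
    return sum(yi * math.log(mi) - mi - math.lgamma(yi + 1)
               for yi, mi in zip(y, mu))

def deviance(y, mu):
    """2 * (log L_saturated - log L_model), using the shortcut mu_i = y_i
    for the saturated model instead of fitting it."""
    return 2.0 * (poisson_loglik(y, y) - poisson_loglik(y, mu))

# Hypothetical toy data: 5 positive counts (assumption, not from the question).
y = [3, 1, 4, 2, 6]
log_y = [math.log(yi) for yi in y]

# Saturated parametrisation A: one dummy variable per observation.
# Linear predictor eta_i = c_i, so c_i = log(y_i) reproduces the data.
coef_a = list(log_y)
fitted_a = [math.exp(c) for c in coef_a]

# Saturated parametrisation B: cumulative indicators a_j(i) = 1 if i >= j.
# Linear predictor eta_i = c_1 + ... + c_i, solved by forward differences.
coef_b = [log_y[0]] + [log_y[i] - log_y[i - 1] for i in range(1, len(y))]
fitted_b = []
eta = 0.0
for c in coef_b:
    eta += c
    fitted_b.append(math.exp(eta))

# Both parametrisations give fitted values equal to y, hence the same log-likelihood.
loglik_a = poisson_loglik(y, fitted_a)
loglik_b = poisson_loglik(y, fitted_b)

# A non-saturated model for comparison: intercept only, mu_i = mean(y).
mean_y = sum(y) / len(y)
dev_null = deviance(y, [mean_y] * len(y))

print("coefficients A:", coef_a)
print("coefficients B:", coef_b)
print("saturated log-likelihoods:", loglik_a, loglik_b)
print("deviance of intercept-only model:", dev_null)
```

The coefficients differ between the two parametrisations, but the fitted means do not, and the likelihood depends on the coefficients only through the fitted means. This is why software can skip fitting a saturated model and simply plug the observed response into the log-likelihood.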
