On this page, I am interested in the section “goodness of fit”, which is near the bottom of the page and contains the table of deviance functions.
The author states that the scaled deviance, i.e. $ D^* = \frac{D(y, \mu)}{\phi} $, has a limiting $ \chi^2_{n - p} $ distribution, where $ n $ is the number of observations and $ p $ is the number of estimated parameters.
He then goes on to say that when $ \phi $ is unknown, it can be estimated as $ \hat{\phi} = \frac{D}{n - p} $. If this is the case, wouldn't the scaled deviance always be equal to $ \frac{D(y, \mu)}{\frac{D}{n - p}} = n - p $, whatever the data? I think I am misunderstanding the difference between $ D(y, \mu) $ and $ D $ (without arguments).
A similar thing occurs with the scaled Pearson's chi-squared statistic.
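To make the problem concrete, here is a minimal numerical sketch of my own (it is not from the page). It assumes that $ D $ without arguments is just the unscaled deviance $ D(y, \hat{\mu}) $ at the fitted means, and it uses an intercept-only Gamma model on simulated data, so that $ p = 1 $ and the fitted mean is simply the sample mean. The function names and the simulation settings are illustrative choices, not anything taken from the source.

```python
import numpy as np

def gamma_deviance(y, mu):
    """Unscaled Gamma deviance: 2 * sum(-log(y/mu) + (y - mu)/mu)."""
    return 2.0 * np.sum(-np.log(y / mu) + (y - mu) / mu)

def pearson_chi2(y, mu):
    """Unscaled Pearson statistic with Gamma variance function V(mu) = mu^2."""
    return np.sum((y - mu) ** 2 / mu ** 2)

# Simulated data (assumption: shape and scale are arbitrary illustrative values).
rng = np.random.default_rng(0)
n, p = 50, 1
y = rng.gamma(shape=2.0, scale=1.5, size=n)

# Intercept-only Gamma model: the fitted mean is the sample mean.
mu_hat = np.full(n, y.mean())

D = gamma_deviance(y, mu_hat)   # unscaled deviance
X2 = pearson_chi2(y, mu_hat)    # unscaled Pearson statistic

phi_dev = D / (n - p)           # dispersion estimated from the deviance
phi_pearson = X2 / (n - p)      # dispersion estimated from Pearson's statistic

scaled_dev = D / phi_dev            # equals n - p by construction
scaled_pearson = X2 / phi_pearson   # equals n - p by construction
cross_scaled_dev = D / phi_pearson  # deviance scaled by the Pearson estimate

print("n - p:", n - p)
print("deviance / phi_dev:", scaled_dev)
print("Pearson / phi_pearson:", scaled_pearson)
print("deviance / phi_pearson:", cross_scaled_dev)
```

Under this reading, the first two ratios come out as $ n - p $ no matter what the data are, so comparing them with a $ \chi^2_{n - p} $ distribution would seem to carry no information. Only the third ratio, which scales the deviance by the Pearson-based estimate, can differ from $ n - p $.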
Could someone elaborate on how to calculate the scaled deviance when $ \phi $ is unknown, and how to proceed with the goodness-of-fit test in that case?