
I have some questions regarding the quasi-likelihood model of GLM:

  1. I understand that one reason to use quasi-likelihood in GLM is over-dispersion. This seems to justify using the quasi-Poisson, or quasi-binomial, with the same mean-variance function, $V(\mu)$, but allowing for estimation of the dispersion parameter, $\phi$. But what is the motivation for using a completely new mean-variance relationship? Is it just pure experimentation?

  2. How do you know that this new quasi-model is good or bad?

  3. The new quasi-model doesn't correspond to any known exponential-family distribution. But does some (unknown, i.e., unnamed) distribution for it exist? Dunn and Smyth discuss something called the "quasi-probability function" (8.10) (integrating the score function, and supposedly taking the exponent of that).
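(For concreteness, if I read (8.10) correctly: the standard construction integrates the quasi-score, giving, for a single observation,
$$Q(\mu; y) = \int_y^{\mu} \frac{y - t}{\phi\, V(t)}\, dt,$$
and the quasi-probability function is then proportional to $e^{Q}$, normalized so that it integrates to one. For the Poisson variance function $V(\mu) = \mu$ with $\phi = 1$, this recovers the Poisson log-likelihood up to an additive constant.)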

Maverick Meerkat
    I wouldn't say quasi-likelihood models have the *same* mean-variance relationship as their maximum-likelihood counterparts. Rather, it's *proportional up to a constant* - i.e., the dispersion parameter: $V_{quasi}(\mu) = \phi V_{MLE}(\mu)$. – AdamO Jul 19 '21 at 16:33

1 Answer

  1. You switch to using the quasi-likelihood because you can see (in your data) or suspect (from prior knowledge) that the standard variance function won't be appropriate. Note that there isn't really anything wrong or bad about using the quasi-likelihood, so if it turns out not to have been needed, it isn't a problem (cf., How to choose between t-test or non-parametric test e.g. Wilcoxon in small samples, for an analogous case).
  2. With respect to the mean function, you do the same things you would do if you weren't using the quasi-likelihood (e.g., look at plots, try a higher-degree polynomial, etc.). With respect to the variance function, this approach assumes the variance is a constant multiple of the default variance function (e.g., if you had count response data where there was high variance at low mean counts and low variance at high mean counts, that wouldn't be covered by this approach). So you want to check that. This can again be done using typical methods (e.g., plots, conditional descriptives, fit a new model to the absolute value of the raw residuals, etc.). If it isn't good, you can try other strategies (see: Alternatives to one-way ANOVA for heteroskedastic data for some examples).
  3. This is hard to answer. If your data aren't distributed as one of the (currently) named distributions, they could be anything. They may well follow some yet-to-be-discovered distribution... or not.
gung - Reinstate Monica