What makes the canonical link function special in GLMs?

Question

Why is the canonical link function used so frequently with GLMs? What makes it "natural"?

Is there any reason to think that $Q(\theta_i)$ (where $Q$ is the canonical link function and $\theta_i$ is the parameter of interest) is better described by a linear combination of predictor variables than some other function of $\theta_i$ would be?

That is, is there any reason to believe that:

$$Q(\theta _i) = \sum_j \beta _j x_{ij}$$

is superior to:

$$f(\theta _i) = \sum_j \beta _j x_{ij}$$ where $f \neq Q$?

If not, is there anything else that makes $Q$ better than $f$? The textbook I am using just states what a canonical link function is and uses it (pretty much exclusively), but does not explain what distinguishes it from an arbitrary link function.

Momo gives a good answer in [this thread](http://stats.stackexchange.com/questions/40876/difference-between-link-function-and-canonical-link-function-for-glm). — P Schnell, Mar 24 '14 at 23:21
Thanks for the link. A few things aren't quite making sense for me. Using Momo's notation: I understand that $\gamma'(\theta_i) = \mu_i$; for the binomial random variable $f(y_i;\theta_i) = (1-\theta_i) \exp\left\{y_i \log\left(\frac{\theta_i}{1-\theta_i}\right)\right\}$. My understanding was that for the binomial: $\gamma(\theta_i) = \log\left(\frac{\theta_i}{1-\theta_i}\right)$. If this is true, then $\gamma'(\theta_i) = \frac{1}{(\theta_i-1)^{2}} \neq E[Y_i] = \theta_i$. I guess I am off somewhere in my understanding. — dlaser, Mar 24 '14 at 23:59
Correction: $\gamma '(\theta_i )= \frac{1}{\theta_i(1-\theta_i)} \neq E[Y_i] = \theta_i$. — dlaser, Mar 25 '14 at 00:05
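
The identity $\gamma'(\theta_i) = \mu_i$ discussed in the comments above can be checked numerically. The sketch below rests on an assumption that the thread does not confirm: that in the linked answer $\theta_i$ denotes the natural parameter (for a Bernoulli variable with success probability $p_i$, the log-odds $\log\frac{p_i}{1-p_i}$), and that $\gamma$ denotes the cumulant function $\log(1 + e^{\theta_i})$ rather than the logit itself. Under that reading, $(1-p_i)\exp\{y_i \log\frac{p_i}{1-p_i}\} = \exp\{y_i\theta_i - \log(1+e^{\theta_i})\}$. All function names are illustrative.

```python
import math


def natural_parameter(p):
    """Log-odds of a Bernoulli success probability p (assumed natural parameter)."""
    return math.log(p / (1.0 - p))


def cumulant(theta):
    """Assumed cumulant function of the Bernoulli family: log(1 + e^theta)."""
    return math.log1p(math.exp(theta))


def cumulant_derivative(theta, h=1e-6):
    """Central finite-difference approximation to d(cumulant)/d(theta)."""
    return (cumulant(theta + h) - cumulant(theta - h)) / (2.0 * h)


# Compare the derivative of the cumulant function, evaluated at the log-odds,
# with the mean E[Y] = p of the Bernoulli variable.
for p in (0.1, 0.5, 0.9):
    theta = natural_parameter(p)
    print(p, cumulant_derivative(theta))
```

If these assumptions hold, the printed derivative should agree with $p$ up to numerical error. That would suggest the mismatch in the comments comes from differentiating the logit with respect to the probability, instead of differentiating the cumulant function with respect to the log-odds.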


0 Answers