Robustness of GLM to link function

Question

When I first learned about GLMs I was taught that the link function wasn't that important so long as the domain and codomain match up. For instance, in a logistic regression we certainly need $g: (0,1) \rightarrow \mathbb R$ but beyond that we don't much care. I also learned that the canonical link makes the math easier but doesn't much matter beyond that.

More recently, though, I've encountered cases where the link function is estimated. If my teachers were right in that the results are generally robust to the link function, why would we ever want to go to all the effort of estimating it?

Can you give examples of places where the link function is estimated? It will help people to answer and understand this question. — Sycorax, Apr 27 '16 at 19:23
I assume the OP is referring to cases where different links are tried & the one that optimizes a goodness of fit metric is chosen. IMO, this Q isn't so unclear it needs to be closed. — gung - Reinstate Monica, Apr 27 '16 at 19:37
Most typical link functions yield lines (sets of predicted probabilities) that are very similar. You can also mimic a different curve w/ splines. It may help to read my answers here: [Difference between logit and probit models](http://stats.stackexchange.com/a/30909/7290) & here: [Is the logit function always the best for regression modeling of binary data?](http://stats.stackexchange.com/a/48137/7290) — gung - Reinstate Monica, Apr 27 '16 at 19:40
@C11H17N2O2SNa: "Single index" models are an example of a case in which the link is estimated. In fact, it's non-parametric! See for example http://arxiv.org/pdf/1506.08910v1.pdf. — Andrew M, Apr 28 '16 at 01:55
See here: http://stats.stackexchange.com/questions/142338/goodness-of-fit-and-which-model-to-choose-linear-regression-or-poisson/142353#142353 for a case where the link function does matter. — kjetil b halvorsen, Apr 28 '16 at 08:12
@AndrewM, thanks for the link, I should have mentioned that single index models were what I had in mind. — alfalfa, Apr 28 '16 at 12:03

score 10 · Accepted Answer · edited Dec 07 '21 at 15:37

If you're fitting only nominal categorical predictors (and models of full order), the link function will be of essentially no consequence --- in the sense that it doesn't alter the fit.
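One way to see why, sketched for the Poisson case (the cell index $k$ and the cell mean $\bar y_k$ are notation introduced here for illustration): in a full-order model each combination of factor levels gets its own free linear predictor $\eta_k$, so $\mu_k = g^{-1}(\eta_k)$ can take any value in the range of $g^{-1}$. The log-likelihood separates by cell,

$$\ell(\mu) = \sum_k \sum_{i \in k} \left( y_i \log \mu_k - \mu_k \right) + \text{const},$$

and setting $\partial \ell / \partial \mu_k = 0$ gives $\hat\mu_k = \bar y_k$ whatever $g$ is, provided $\bar y_k$ lies in the range of $g^{-1}$. The link changes the coefficients but not the fitted means.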

Here's an example using log and identity links with a Poisson glm. First the data (y is the response, a count, and x1f and x2f have the levels of the factors):

     y    157  909  249  144  876  248   34  205   62   26  243   48
     x1f    1    1    1    1    1    1    2    2    2    2    2    2
     x2f    1    2    3    1    2    3    1    2    3    1    2    3

Here are the fitted values for the full model with interaction:

    fitted(glm(y ~ x1f+x2f+x1f:x2f, family=poisson(link="log")))
        1     2     3     4     5     6     7     8     9    10    11    12 
    150.5 892.5 248.5 150.5 892.5 248.5  30.0 224.0  55.0  30.0 224.0  55.0 

    fitted(glm(y ~ x1f+x2f+x1f:x2f, family=poisson(link="identity")))
        1     2     3     4     5     6     7     8     9    10    11    12 
    150.5 892.5 248.5 150.5 892.5 248.5  30.0 224.0  55.0  30.0 224.0  55.0

We see the fit didn't change even though the link function did.
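For anyone who wants to reproduce this, here is a minimal sketch. It assumes the data are entered as a data frame called `d` (a name chosen here, not in the original) with `x1f` and `x2f` stored as factors, and it checks that both saturated fits equal the within-cell sample means.

```r
# Data from the example above; the data frame name `d` is an assumption.
d <- data.frame(
  y   = c(157, 909, 249, 144, 876, 248, 34, 205, 62, 26, 243, 48),
  x1f = factor(rep(1:2, each = 6)),
  x2f = factor(rep(1:3, times = 4))
)

# Full-order model (main effects plus interaction) under two links.
fit_log <- glm(y ~ x1f * x2f, family = poisson(link = "log"), data = d)
fit_id  <- glm(y ~ x1f * x2f, family = poisson(link = "identity"), data = d)

# Sample mean of y within each of the six factor-level combinations.
cell_means <- ave(d$y, d$x1f, d$x2f)

# Compare each set of fitted values with the cell means.
all.equal(unname(fitted(fit_log)), cell_means, tolerance = 1e-6)
all.equal(unname(fitted(fit_id)),  cell_means, tolerance = 1e-6)
```

The coefficients of the two fits differ, since they live on different scales, but the fitted means agree.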

If you're fitting categorical models which leave some interactions out (such as a main-effects-only model), then the link function can matter. Under some link functions those interactions may genuinely vanish, leaving the smaller model suitable and more easily interpreted, but the same simpler, additive model then won't be suitable under other link functions.

Continuing the earlier example, omitting the interactions:

    fitted(glm(y ~ x1f+x2f, family=poisson(link="log")))
            1         2         3         4         5         6         7         8 
    145.65183 900.94330 244.90487 145.65183 900.94330 244.90487  34.84817 215.55670 
            9        10        11        12 
     58.59513  34.84817 215.55670  58.59513 

    fitted(glm(y ~ x1f+x2f, family=poisson(link="identity")))
            1         2         3         4         5         6         7         8 
    238.72879 618.67616 268.07978 238.72879 618.67616 268.07978  21.90564 401.85300 
            9        10        11        12 
     51.25663  21.90564 401.85300  51.25663

Now we see the fitted values are indeed different. In this case, the log link gives a reasonable fit, but the identity link gives quite a poor fit.
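One hedged way to put a number on "reasonable" versus "poor" is to compare the residual deviances of the two main-effects fits. The sketch below rebuilds the data in a data frame named `d` (an assumed name) so that it runs on its own. Both models have the same number of parameters, so the deviances are directly comparable.

```r
# Data from the example above; the data frame name `d` is an assumption.
d <- data.frame(
  y   = c(157, 909, 249, 144, 876, 248, 34, 205, 62, 26, 243, 48),
  x1f = factor(rep(1:2, each = 6)),
  x2f = factor(rep(1:3, times = 4))
)

# Main-effects-only models under the two links.
fit_log <- glm(y ~ x1f + x2f, family = poisson(link = "log"), data = d)
fit_id  <- glm(y ~ x1f + x2f, family = poisson(link = "identity"), data = d)

# Residual deviance of each fit (smaller means closer to the data).
dev <- c(log = deviance(fit_log), identity = deviance(fit_id))
dev
```

The log-link deviance comes out far smaller, matching what the fitted values suggest: the effects here are close to multiplicative, so they are additive on the log scale but not on the identity scale.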

If you're fitting continuous predictors, then it may matter quite a bit --- even ignoring the issue of interactions. One example would be with binomial GLMs --- in many cases, the fit with a probit and a cloglog link can look quite different, even though they both have $g$ taking $(0,1)$ to $\mathbb{R}$.
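To illustrate the difference in shape, the sketch below evaluates the two inverse links on a grid of linear-predictor values using `make.link()` from base R's stats package. It compares the curves themselves, not fitted models (in a real fit the coefficients would rescale to compensate somewhat), and the grid `eta` is an arbitrary choice.

```r
# Grid of linear-predictor values, symmetric about zero (arbitrary choice).
eta <- seq(-3, 3, by = 0.5)

# Inverse link functions map the linear predictor to a probability.
p_probit  <- make.link("probit")$linkinv(eta)
p_cloglog <- make.link("cloglog")$linkinv(eta)

# Side-by-side table of the two curves.
round(cbind(eta, p_probit, p_cloglog), 3)

# The probit curve is symmetric: p(eta) + p(-eta) = 1.
sym_probit <- max(abs(p_probit + rev(p_probit) - 1))

# The cloglog curve is not; this gap is clearly nonzero.
sym_cloglog <- max(abs(p_cloglog + rev(p_cloglog) - 1))

c(probit = sym_probit, cloglog = sym_cloglog)
```

The probit curve approaches 0 and 1 at the same rate, while the cloglog curve approaches 1 much faster than it approaches 0. That asymmetry is what can make the two fits look different when the data span a wide range of probabilities.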

How much it might matter really depends on the specifics of the problem and your tolerance of deviation.

In many cases ease of interpretation matters more than differences in fit (at least where those differences tend to be small). But there is a trade-off between how easy the link function is to work with and how interpretable the linear predictor is, and there is also the issue of potential lack of fit: if the curve relating the mean of your binomial variates to the predictor(s) isn't symmetric, it might be much more interpretable to choose a more suitable link than to expand the model class.

+1 for the notion that interpretation is a primary motivation for choosing a link function, especially in binomial GLM. — Andrew M, Apr 28 '16 at 01:59
Thanks a lot for the answer. Can you clarify the first paragraph? I don't see why the link doesn't matter in that case. — alfalfa, Apr 28 '16 at 12:03
@alfalfa I've added some clarification of the intent, and an example that shows what I claim happens. — Glen_b, Apr 28 '16 at 14:05
