Selecting a link function for GLMs

Question

If you don't care about using GLM model parameters to predict anything, but simply want to select the best-fitting model for your data, is it necessary to get into the theoretical debate as to which link function to use? Is it OK to simply select the link function that gives you the lowest deviance?

Specifically, I am running an ordinal GLMM, and have found that the log-logistic link function in the software program called SuperMix fits my data the best.

Let's try a role reversal: You're in a seminar, or reviewing a paper, or examining a thesis, and the other person says "I chose such-and-such link because it gave the best fit, and nothing else really matters". Do you think "Fine by me"? I would want to ask lots of questions, not least: does that mesh with the known science here? can you show graphically that the model looks about right? have you looked at residuals and failed to find patterns? what other models did you think about and do they perform as well or much worse? are you confident that you have a defensible set of predictors? — Nick Cox, Sep 08 '14 at 17:07
Thanks, Nick. I've been reading about link functions and finding it hard to wrap my mind around how they are used. As I understand it, logit, probit, and complementary log-log cumulative probability distributions are all similar-looking, but some have heavier tails and/or are asymmetric. Looking more in depth at the log-log CDF, it appears to be completely the opposite relationship to the other three - decreasing (instead of increasing) probability with increasing x. (cont'd in next post...) — Cynthia Tedore, Sep 08 '14 at 23:31
(cont'd from previous post) Anyway, when you specify one of these as your link function, its CDF is the function you want to fit your data to, but since estimating procedures are difficult for binary or ordinal data, the link function transforms the dependent variable to be linear so that OLS can be done on it. Does that all sound correct? I've read gung's excellent post on this topic, and I think I understand the argument for theoretical considerations, but what if theoretical considerations are not sufficient to allow you to choose between link functions? Then how do you choose? — Cynthia Tedore, Sep 08 '14 at 23:32
OLS is irrelevant here. You don't transform the dependent variable, as (e.g.) logit 0 or logit 1 is indeterminate. The link function is not identical to the model being fitted. It's hard to put a good explanation in a comment. You seem to be changing the question.... — Nick Cox, Sep 08 '14 at 23:37
Sorry, just trying to understand the underlying logic of a link function so as to better understand which one to choose... So lost... — Cynthia Tedore, Sep 09 '14 at 00:41
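
To make the points in the comments above a bit more concrete, here is a minimal Python sketch (standard library only) of the inverse link functions being discussed. It is an illustration, not what SuperMix does internally: the function names are made up for this example, and the sign convention for the log-log link is an assumption, since software differs on it. The sketch shows two things: each inverse link maps a linear predictor eta to a probability, and applying the logit directly to observed 0/1 responses fails, which is why the link acts on the modelled mean rather than on the data.

```python
import math

# Inverse links: map a linear predictor eta (any real number) to a probability.
# All function names here are hypothetical, chosen for this illustration.

def inv_logit(eta):
    return 1.0 / (1.0 + math.exp(-eta))

def inv_probit(eta):
    # Standard normal CDF written with the error function.
    return 0.5 * (1.0 + math.erf(eta / math.sqrt(2.0)))

def inv_cloglog(eta):
    # Complementary log-log link: g(p) = log(-log(1 - p)).
    return 1.0 - math.exp(-math.exp(eta))

def inv_loglog(eta):
    # Assumed convention: g(p) = -log(-log(p)), which is increasing in eta.
    return math.exp(-math.exp(-eta))

def inv_loglog_unsigned(eta):
    # Alternative convention: g(p) = log(-log(p)), which is decreasing in eta.
    return math.exp(-math.exp(eta))

def logit(p):
    return math.log(p / (1.0 - p))

etas = [-2.0, -1.0, 0.0, 1.0, 2.0]
links = [
    ("logit", inv_logit),
    ("probit", inv_probit),
    ("cloglog", inv_cloglog),
    ("loglog", inv_loglog),
    ("loglog (unsigned)", inv_loglog_unsigned),
]
for name, inverse in links:
    print(name, [round(inverse(eta), 3) for eta in etas])

# The link is applied to the mean of the response, not to the raw 0/1 data:
# logit(0) and logit(1) are not finite numbers.
for y in (0, 1):
    try:
        logit(y)
    except (ValueError, ZeroDivisionError) as exc:
        print("logit(%d) fails: %s" % (y, type(exc).__name__))
```

Under these assumptions, the logit and probit curves are symmetric about eta = 0, while the complementary log-log and log-log curves are not. Whether the log-log curve rises or falls with eta depends only on the sign convention, so a decreasing log-log curve may reflect how the link is defined in a given source rather than a different kind of relationship.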

