
From this answer, the following statement is posed: 'Though not "wrong", you'd want a good reason for using an identity link to model a Bernoulli probability.'

I would like to know what good reasons would result in an identity link being used for logistic regression, if these exist. Do these reasons generalise to GLMs?

Alex

1 Answer


I frequently use the identity link to model a Bernoulli probability when I want adjusted risk differences, or simply adjusted risks. If you want risk differences (e.g. $\hat{p}_1-\hat{p}_2$) and have no need for odds ratios, this is the most straightforward way to calculate them. See this paper for additional details: http://aje.oxfordjournals.org/content/162/3/199.full
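As a concrete illustration, here is a minimal sketch in plain Python of a binomial model with an identity link, fitted by iteratively reweighted least squares. The data, the variable names (`exposed`, `confounder`) and the helper functions are all made up for this example and are not taken from the linked paper; there is also no safeguard against fitted risks leaving (0, 1).

```python
# Grouped data: (exposed, confounder, events, trials). Illustrative numbers only.
cells = [
    (0, 0, 10, 100),
    (1, 0, 20, 100),
    (0, 1, 30, 100),
    (1, 1, 40, 100),
]


def solve(a, b):
    """Solve a small linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Partial pivoting: bring the largest remaining entry onto the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [v - f * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_identity_binomial(cells, iters=25):
    """IRLS for risk = b0 + b1*exposed + b2*confounder (identity link)."""
    X = [[1.0, float(x), float(z)] for x, z, _, _ in cells]
    y = [events / trials for _, _, events, trials in cells]
    n = [trials for _, _, _, trials in cells]
    # Start every cell at the average of the observed cell proportions.
    mu = [sum(y) / len(y)] * len(y)
    beta = [0.0, 0.0, 0.0]
    for _ in range(iters):
        # With the identity link the working response is y itself;
        # only the weights n / (mu * (1 - mu)) change between iterations.
        w = [ni / (mi * (1.0 - mi)) for ni, mi in zip(n, mu)]
        xtwx = [[sum(wi * xi[j] * xi[k] for wi, xi in zip(w, X))
                 for k in range(3)] for j in range(3)]
        xtwy = [sum(wi * xi[j] * yi for wi, xi, yi in zip(w, X, y))
                for j in range(3)]
        beta = solve(xtwx, xtwy)
        mu = [sum(b * v for b, v in zip(beta, xi)) for xi in X]
    return beta


beta = fit_identity_binomial(cells)
print("adjusted risk difference for exposure:", round(beta[1], 6))
```

The toy data were built so that the risks are exactly additive (0.10, 0.20, 0.30, 0.40), so the coefficient on `exposed` can be read directly as the risk difference adjusted for `confounder`. With real data you would use your package's GLM routine with a binomial family and an identity link rather than hand-rolled code like this.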

Logistic regression is itself a generalized linear model (a binomial GLM with a logit link), so yes, the reasoning generalises to other GLMs.
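A short sketch of why, in standard GLM notation (the symbols $g$, $\mu_i$, $x_i$ and $\beta$ are my own choice here, not taken from the linked paper):

$$g(\mu_i) = x_i^\top \beta, \qquad \mu_i = E[Y_i \mid x_i].$$

With the identity link $g(\mu) = \mu$, this becomes $\mu_i = x_i^\top \beta$, so $\beta_j$ is the change in the mean for a one-unit change in $x_{ij}$ with the other covariates held fixed. For a Bernoulli outcome $\mu_i = p_i$, so $\beta_j$ is an adjusted risk difference. With $g(\mu) = \log\{\mu/(1-\mu)\}$ the same coefficient is a log odds ratio, and with $g(\mu) = \log \mu$ it is a log risk ratio.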

StatsStudent
  • are these risk differences valid even if $\hat{p}_i$ are negative? – Alex Feb 25 '16 at 05:07
  • the paper does not make any sense to me. A concrete example would be really helpful, as I don't understand SAS code. – Alex Feb 25 '16 at 05:09
  • You shouldn't really need to understand the SAS to get the gist of this paper. Plus, I already told you what it said (you can use it for adjusted risk differences). The authors simply show that they use an identity link with both binomial and Poisson regression to obtain adjusted risk differences: "By replacing link=log with link=identity in the MODEL statement, multivariate-adjusted risk (prevalence) differences are obtained as follows". The model statements in SAS and R are nearly identical. You shouldn't have any trouble reading the SAS model statement that follows the text above. – StatsStudent Feb 25 '16 at 05:50
  • @Alex, how would you get a negative probability? In logistic regression, $\hat{p}$ is restricted to be between 0 and 1. – StatsStudent Feb 25 '16 at 17:36
  • Doesn't the identity link imply a linear regression with likelihood calculated using the logistic distribution? How are probabilities restricted to 0 and 1 if you extrapolate using this model? – Alex Feb 25 '16 at 22:37
  • Hi, @Alex, sorry, I misunderstood your previous reference to $\hat{p}$. I thought you were using it in the logistic regression context with a logit link. Yes, using an identity link, you can indeed obtain risks that fall outside of 0 and 1, and the models can sometimes fail to converge. Methods have been proposed for handling that situation when it occurs. – StatsStudent Feb 25 '16 at 23:46
  • Also see, for example, http://aje.oxfordjournals.org/content/123/1/174.abstract?ijkey=41bfebc687e24e42ce530c04be6defe98fa983dc&keytype2=tf_ipsecsha and http://aje.oxfordjournals.org/content/166/11/1337.abstract?ijkey=f7b96993742bfe3161e3fdd88fb45a2aeefeae81&keytype2=tf_ipsecsha – StatsStudent Feb 25 '16 at 23:46