
In the idealized logistic model, we obtain an S-shaped curve linking each continuous IV to the DV. But in practice this S-shape rarely appears, which makes the logistic approach seem somewhat less compelling for such data. Of course, predicted probabilities that each observation will be "1" on the DV are usable in logistic but not in OLS regression, since in the latter these probabilities can fall outside [0,1]. But for exploratory purposes, and if we don't need predicted probabilities, how sound is it to use OLS to see which IVs have strong vs. moderate vs. weak relationships with the DV? Wouldn't this amount to a sort of multivariate version of point-biserial correlation? (Standardized regression coefficients, not to mention collinearity statistics and partial plots, are all, I think, more easily obtained in OLS than in logistic.)
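
To make the point-biserial analogy concrete, here is a minimal sketch in plain Python for the single-IV case. The simulated data, the seed, and the helper names (`pearson`, `ols_fit`, `sd`) are assumptions for illustration only. It checks that the standardized OLS slope on a binary DV equals the point-biserial (Pearson) correlation, and counts how many OLS fitted values leave [0,1].

```python
import math
import random


def mean(values):
    return sum(values) / len(values)


def sd(values):
    # Sample standard deviation (n - 1 in the denominator).
    m = mean(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / (len(values) - 1))


def pearson(xs, ys):
    # Pearson correlation; with a binary ys this is the point-biserial correlation.
    mx, my = mean(xs), mean(ys)
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sxx = sum((a - mx) ** 2 for a in xs)
    syy = sum((b - my) ** 2 for b in ys)
    return sxy / math.sqrt(sxx * syy)


def ols_fit(xs, ys):
    # Simple (one-predictor) least-squares fit: returns (intercept, slope).
    mx, my = mean(xs), mean(ys)
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sxx = sum((a - mx) ** 2 for a in xs)
    slope = sxy / sxx
    return my - slope * mx, slope


# Hypothetical data: one continuous IV, binary DV generated from a logistic curve.
random.seed(0)
x = [random.gauss(0, 1) for _ in range(500)]
y = [1 if random.random() < 1 / (1 + math.exp(-2 * xi)) else 0 for xi in x]

intercept, slope = ols_fit(x, y)
std_slope = slope * sd(x) / sd(y)   # standardized regression coefficient
r_pb = pearson(x, y)                # point-biserial correlation

fitted = [intercept + slope * xi for xi in x]
n_outside = sum(1 for f in fitted if f < 0 or f > 1)

print(f"standardized OLS slope: {std_slope:.4f}")
print(f"point-biserial r:       {r_pb:.4f}")
print(f"fitted values outside [0,1]: {n_outside} of {len(fitted)}")
```

With several IVs the standardized coefficients are partial effects and no longer equal the individual correlations, so this identity only holds for the one-predictor case sketched here.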

rolando2

1 Answer


If the explanatory variables take values over the entire real line, it makes little sense to express an expectation that is a proportion in $[0,1]$ as a linear function of a variable defined over the entire real line. If the sigmoid shape of the logit transformation doesn't describe the shape of the relationship, then perhaps it is best to search for a different transformation that maps $[0,1]$ into $(-\infty, \infty)$.
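
As a rough sketch of what such transformations look like, the snippet below evaluates the logit alongside two commonly implemented alternatives, the probit and the complementary log-log. The function names and the grid of probabilities are assumptions chosen for illustration. All three are evaluated on the open interval $(0,1)$, since they are undefined at exactly 0 and 1.

```python
import math
from statistics import NormalDist


def logit(p):
    # Inverse CDF of the standard logistic distribution.
    return math.log(p / (1 - p))


def probit(p):
    # Inverse CDF of the standard normal distribution.
    return NormalDist().inv_cdf(p)


def cloglog(p):
    # Complementary log-log; unlike the other two, it is asymmetric around p = 0.5.
    return math.log(-math.log(1 - p))


# Illustrative grid of probabilities strictly inside (0, 1).
for p in (0.001, 0.1, 0.5, 0.9, 0.999):
    print(f"p={p:<6} logit={logit(p):8.3f} probit={probit(p):8.3f} cloglog={cloglog(p):8.3f}")
```

Each function sends probabilities near 0 to large negative values and probabilities near 1 to large positive values, which is what lets a linear predictor range over the whole real line.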

Macro
Michael R. Chernick
  • +1. To add to the last thing Michael said, probit and complementary log-log are two other functions that map $(0,1)$ to $(-\infty, \infty)$ and are implemented in many software packages. – Macro May 30 '12 at 12:14
  • Note also that just about any function that corresponds to a CDF of some real-valued random variable is a candidate. Logistic, probit, and c-log-log are three such functions (hyperbolic secant, normal, and extreme value random variables). So you could also "in principle" use a skew-normal link function, or double exponential, or t, etc. The t distribution is useful when the degrees of freedom are treated as unknown, as you can approximately balance between the probit and logit link functions. – probabilityislogic May 30 '12 at 12:50
  • @probabilityislogic, you've made an important point but nitpick: I think the logistic function is the (inverse) CDF of the logistic distribution, not the hyperbolic secant distribution. – Macro May 30 '12 at 12:58
  • Thanks to you all. Does it follow from your answers that you would practically never use point-biserial correlation? – rolando2 May 31 '12 at 23:59
  • Happened on the following: "OLS regression. When used with a binary response variable, this model is known as a linear probability model and can be used as a way to describe conditional probabilities. [...] For a more thorough discussion of [...] problems with the linear probability model, see Long (1997, p. 38-40). Long, J. Scott (1997). Regression Models for Categorical and Limited Dependent Variables. Thousand Oaks, CA: Sage Publications." http://www.ats.ucla.edu/stat/stata/dae/logit.htm – rolando2 Jun 12 '12 at 01:50
  • The linear probability model is pretty common in applied social science research. It might not be "theoretically" sound but Horrace & Oaxaca (2006) show that its bias depends on the number of predictions it makes outside of $[0,1]$. Also remember the logit is approximately linear around 0.5. So if your sample proportion isn't too far from 50% and the estimates are on the order of 0.01 (a common enough scenario) it's a decent first look at a first-order model. – shadowtalker Dec 24 '13 at 04:20
  • A JAMA study on post-hospital mortality and readmission uses the linear probability model, with logistic regression used as a check in a sensitivity analysis. Authors cite 2 reasons: computational efficiency given >1000 predictors and avoiding quasi- or complete separation. http://jamanetwork.com/journals/jamainternalmedicine/fullarticle/2593255 – rolando2 Dec 20 '16 at 15:01
  • For more, see https://stats.stackexchange.com/questions/304437/why-do-researchers-in-economics-use-linear-regression-for-binary-response-variab – rolando2 Sep 22 '17 at 17:31
  • https://statisticalhorizons.com/when-can-you-fit argues that "the linear probability model can be used whenever the relationship between probability and log odds is approximately linear over the range of modeled probabilities. Probabilities between .2 and .8 are one range where approximate linearity holds [...]" – rolando2 Apr 24 '18 at 17:32
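
The comments above about approximate linearity of the logit can be checked numerically. The following is a minimal sketch that compares the logit with its tangent line at $p = 0.5$. Using the tangent line (slope 4) as the linear reference is an assumption made here for illustration; the sources cited in the comments may use a different criterion.

```python
import math


def logit(p):
    return math.log(p / (1 - p))


def tangent_at_half(p):
    # First-order Taylor expansion of the logit around p = 0.5.
    # d/dp logit(p) = 1 / (p * (1 - p)), which equals 4 at p = 0.5.
    return 4 * (p - 0.5)


def gap(p):
    # How far the logit departs from the straight-line approximation.
    return logit(p) - tangent_at_half(p)


for p in (0.5, 0.6, 0.7, 0.8, 0.9, 0.95):
    print(f"p={p:<5} logit={logit(p):6.3f} linear={tangent_at_half(p):6.3f} gap={gap(p):6.3f}")
```

The gap stays small for probabilities between roughly .2 and .8 and grows quickly beyond that range, which is consistent with treating the linear probability model as a first look only when modeled probabilities are not extreme.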