
I understand that binary logistic regression is applied to binary classification problems where the dependent variable $Y$ has only two possible outcomes. The independent variables are $x$. The result of logistic regression is assigning a probability $p$ to one of the two outcomes and a probability $1-p$ to the other.

I am confused about how the linear combination of the independent variables $w_1 x_1 + w_2 x_2 + w_3 x_3$, the log-odds $\log \frac{p}{1-p}$, the probability $p$, and the logistic function $\frac{1}{1+e^{-x}}$ are connected to each other.

Can someone help me logically understand how these concepts go together so I can finally appreciate how logistic regression works?

Thank you!

  • Could you make your question more specific by indicating which aspects are *not* covered in other threads here on CV, such as https://stats.stackexchange.com/questions/29325, https://stats.stackexchange.com/questions/52825, https://stats.stackexchange.com/questions/34636, and https://stats.stackexchange.com/questions/133623? – whuber Dec 14 '21 at 22:40
  • Briefly; the two functions are affectionately called "logit" and "expit". They are inverses of each other. To Dave's answer, the algebra is greatly simplified to recall the rule x/(1-x) = y implies y/(y+1) = x with some conditions on x. – AdamO Dec 15 '21 at 06:23

1 Answer


This is the logistic regression model, where the log-odds are posited to change as a linear function of some predictors.

$$ \log\bigg( \dfrac{p}{1-p} \bigg) = X\beta $$

$X\beta$ is the linear combination. You denote it as $w_1 x_1 + w_2 x_2 + w_3 x_3$. A more traditional way to write it would use $\beta$ as the symbol for the coefficients and would include an intercept, so more like: $$X\beta = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3$$
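
To make this concrete, here is a minimal numeric sketch (the coefficients and data are made up, and using NumPy is my choice, not something the answer prescribes) of forming the linear predictor $X\beta$, with a leading column of ones for the intercept:

```python
import numpy as np

# Hypothetical coefficients: beta_0 (intercept), beta_1, beta_2, beta_3
beta = np.array([0.5, 1.0, -2.0, 0.3])

# Design matrix X: one row per observation,
# first column of ones so that beta_0 acts as the intercept
X = np.array([[1.0, 0.2, 1.5, 3.0],
              [1.0, 1.1, 0.4, 0.7]])

log_odds = X @ beta   # the linear combination X beta: one log-odds value per row
print(log_odds)       # [-1.4   1.01]
```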

In order to solve for $p$, we must do some algebra.

$$ \log\bigg( \dfrac{p}{1-p} \bigg) = X\beta\implies\\ \dfrac{p}{1-p} = \exp(X\beta)\implies\\ p = (1 - p) \exp(X\beta)\implies\\ p = \exp(X\beta) - p \exp(X\beta)\implies\\ p+p\exp(X\beta) = \exp(X\beta)\implies\\ p(1 + \exp(X\beta)) = \exp(X\beta)\implies\\ p = \dfrac{\exp(X\beta)}{1 + \exp(X\beta)}\implies\\ p = \bigg( \dfrac{1 + \exp(X\beta)}{\exp(X\beta)} \bigg)^{-1}\implies\\ p =\bigg( \dfrac{1}{\exp(X\beta)} + 1 \bigg)^{-1}\implies\\ p =\bigg( \exp(-X\beta) + 1 \bigg)^{-1}\implies\\ p = \dfrac{1}{1 + \exp(-X\beta)} $$
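
As a numeric check of this algebra (same made-up numbers as above; `scipy.special.expit` and `logit` are the two functions AdamO's comment calls by name), applying the logistic function to $X\beta$ gives $p$, and the log-odds transform recovers $X\beta$:

```python
import numpy as np
from scipy.special import expit, logit  # expit(x) = 1/(1+exp(-x)), logit(p) = log(p/(1-p))

beta = np.array([0.5, 1.0, -2.0, 0.3])
X = np.array([[1.0, 0.2, 1.5, 3.0],
              [1.0, 1.1, 0.4, 0.7]])

eta = X @ beta                     # the linear predictor, i.e. the log-odds
p = 1.0 / (1.0 + np.exp(-eta))     # the closed form derived above

print(np.allclose(p, expit(eta)))  # True: matches SciPy's logistic function
print(np.allclose(logit(p), eta))  # True: logit inverts expit, recovering X beta
```

So the chain is: the linear combination produces the log-odds, and the logistic function maps the log-odds back to a probability in $(0,1)$.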

Dave
  • Great help. The starting point, as you mention, is that "the log-odds are posited to change as a linear function of some predictors". I follow the derivation. But, naively, what is the purpose of taking the log of the odds and setting that equal to the linear combination of the independent variables? – Brett Cooper Dec 15 '21 at 02:13
  • That’s called the “link function” of the generalized linear model, and you can use other link functions with Binomial $y_i$ variables. Probit regression uses the standard normal quantile (inverse CDF) function, and [other inverse CDFs are viable, too](https://stats.stackexchange.com/q/505573/247274). – Dave Dec 15 '21 at 02:19
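
To illustrate the alternative link functions mentioned in this last comment, here is a brief sketch (the values are made up; `scipy.stats.norm.cdf` is the standard normal CDF) comparing the logistic and probit inverse links applied to the same linear predictor:

```python
import numpy as np
from scipy.special import expit
from scipy.stats import norm

eta = np.linspace(-3, 3, 7)   # a few linear-predictor (log-odds) values

p_logistic = expit(eta)       # inverse of the logit link: 1/(1+exp(-eta))
p_probit = norm.cdf(eta)      # inverse of the probit link: standard normal CDF

# Both map the real line into (0, 1); they differ mainly in the tails.
print(np.round(p_logistic, 3))
print(np.round(p_probit, 3))
```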