I realize that this is a relatively well published and well understood concept, but I'm unfamiliar with how the conclusion is derived. Could someone ELI5 why, when the data are perfectly separable, the MLE for a logistic regression is $\hat{\beta} = \infty$?
It is not clear what equation you are referring to. – Michael R. Chernick Apr 12 '17 at 03:48
@MichaelChernick http://stats.stackexchange.com/questions/11109/how-to-deal-with-perfect-separation-in-logistic-regression – SmallChess Apr 12 '17 at 03:52
1 Answer
Imagine you have two data points:
- $x_1 = 1$, $y_1= 1$
- $x_2 = -1$, $y_2 = 0$
And let's say we're estimating $b$ in $P(Y= 1 \mid X = x) = \frac{e^{bx}}{1 + e^{bx}}$
How does $b=5$ do? Let $P_1$ and $P_2$ be the probability of correctly forecasting the 1st and 2nd point respectively. We have:
- $P_2 = P_1 = \frac{e^5}{1+e^5} \approx 0.9933$
Can we do better? Sure. How about $b=6$?
- $P_2 = P_1 = \frac{e^6}{1+e^6} \approx 0.9975$
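To see that this is not specific to $b=5$ and $b=6$, here is a short derivation for the two-point example. The symbol $L(b)$ for the likelihood is notation introduced here for illustration; it does not appear in the original answer.

$$L(b) = P_1 P_2 = \left(\frac{e^{b}}{1+e^{b}}\right)^{2}, \qquad \frac{d}{db}\log L(b) = 2\left(1 - \frac{e^{b}}{1+e^{b}}\right) = \frac{2}{1+e^{b}} > 0 \ \text{ for every finite } b.$$

Since the derivative is strictly positive, $L(b)$ keeps rising toward its supremum of $1$ as $b \to \infty$ but never reaches it at any finite $b$.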
With perfect separation, increasing $b$ always yields a higher likelihood, so the likelihood has no finite maximizer and the maximum likelihood estimate (MLE) doesn't exist. In a sense, the MLE is $b = \infty$. See the nifty graphic in this answer: https://stats.stackexchange.com/a/224864/97925
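If it helps to see this numerically, here is a minimal Python sketch for the two-point example above. The function name `likelihood` and the grid of $b$ values are my own choices for illustration, not part of the original answer.

```python
import math

def likelihood(b):
    """Likelihood of the points (x=1, y=1) and (x=-1, y=0)
    under P(Y=1 | X=x) = e^{bx} / (1 + e^{bx})."""
    p1 = 1.0 / (1.0 + math.exp(-b))        # P(Y=1 | x=1)
    p2 = 1.0 - 1.0 / (1.0 + math.exp(b))   # P(Y=0 | x=-1) = 1 - P(Y=1 | x=-1)
    return p1 * p2

# Illustrative grid of slopes; any increasing sequence shows the same pattern.
b_values = [1, 2, 5, 6, 10, 20]
values = [likelihood(b) for b in b_values]

for b, lik in zip(b_values, values):
    print(f"b = {b:>2}: likelihood = {lik:.10f}")
```

Each larger $b$ gives a larger likelihood, and none of them reaches 1, so an optimizer that keeps following the gradient would push $b$ upward without ever settling on a finite value.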

Matthew Gunn