Predicted values from a logistic regression?

Question

I'm doing a logistic regression, which I understand I can do by simply saying

$$ \operatorname{logit}(Y)=\beta_0+\beta_1 x+\varepsilon $$

where $\varepsilon$ is normally distributed around $0$. Then we can use the usual OLS methodology to fit the $\beta$s, and setting $\varepsilon = 0$ gives us our best estimate $\widehat{\operatorname{logit}(Y)}$.

My question is: how can we find $\hat Y$ from here? I think that it isn't as simple as $\hat Y=\operatorname{logit}^{-1}\left(\widehat{\operatorname{logit}(Y)}\right)$, because I know that in the analogous log-normal case, $\hat Y=\exp\left(\widehat{\log(Y)}+\frac{1}{2}\sigma^2\right)$.

I looked up the logit-normal distribution (https://en.wikipedia.org/wiki/Logit-normal_distribution), but it says that there's no analytical solution for the mean of such a distribution. I think I must be missing something, though, because what good is logistic regression if not to estimate $Y$?

It might help to review the basic concepts; in [logistic regression](https://en.wikipedia.org/wiki/Logistic_regression) the logit transform is of the mean, rather than of the data (which means it works on data consisting only of 0 and 1, for example). See also the sections on the generalized linear model relating to [intuition](https://en.wikipedia.org/wiki/Generalized_linear_model#Intuition) and the following overview section. There are many useful posts on site relating to logistic regression — Glen_b, May 29 '17 at 02:51
Possible duplicates: [How to specify a logistic regression as a transformed linear regression](https://stats.stackexchange.com/questions/162251/how-to-specify-logistic-regression-as-transformed-linear-regression) and [logit link in glm and inverse logit](https://stats.stackexchange.com/questions/262019/logit-link-in-glm-and-inverse-logit) — Glen_b, May 29 '17 at 03:05

score 4 · Accepted Answer · answered May 28 '17 at 06:53

Your understanding of logistic regression has some errors.

The logistic regression equation is

$$ \operatorname{logit}(E(Y))=\beta_0+\beta_1 x $$

Notice that there is no random term on the right-hand side of the model. The linear part equals the logit of the expected value of $Y$ exactly.

The randomness comes from how $Y$ disperses around its expectation. To write the model explicitly in your style, you would have to write something like

$$ Y \mid x \sim \operatorname{Bernoulli}\left(p = \operatorname{logit}^{-1}(\beta_0+\beta_1 x) \right) $$
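To make the Bernoulli formulation concrete, here is a minimal simulation sketch in plain Python. The coefficient values, the covariate value, and the helper names `inv_logit` and `simulate_y` are assumptions chosen purely for illustration. It shows that each observed $Y$ is only ever 0 or 1 (so $\operatorname{logit}(Y)$ is not even finite), while the average of many draws approaches $\operatorname{logit}^{-1}(\beta_0+\beta_1 x)$.

```python
import math
import random

def inv_logit(eta):
    """Inverse logit (logistic function): maps a real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-eta))

def simulate_y(x, beta0, beta1, n, rng):
    """Draw n Bernoulli outcomes with success probability inv_logit(beta0 + beta1 * x)."""
    p = inv_logit(beta0 + beta1 * x)
    return [1 if rng.random() < p else 0 for _ in range(n)]

# Hypothetical coefficients and covariate value, for illustration only.
beta0, beta1, x = -1.0, 2.0, 0.75
rng = random.Random(0)
ys = simulate_y(x, beta0, beta1, n=100_000, rng=rng)

p_true = inv_logit(beta0 + beta1 * x)  # E(Y | x) under the model
p_hat = sum(ys) / len(ys)              # sample mean of the 0/1 outcomes
print(f"model mean: {p_true:.4f}, sample mean: {p_hat:.4f}")
```

All of the noise lives in the Bernoulli draw; the linear predictor itself carries no error term.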

As a consequence, you cannot use OLS to fit a logistic regression. Logistic regressions are fit by maximum likelihood using iterative optimization, usually based on Newton's method.
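As a rough illustration of such an iterative fit, the sketch below applies Newton's method to the Bernoulli log-likelihood for a single predictor, using only the Python standard library. The simulated data, the "true" coefficients, and the function name `fit_logistic_newton` are assumptions made for the example; real software adds safeguards (step control, separation checks) that are omitted here. Once the coefficients are fitted, the predicted value at a new $x$ is simply the estimated mean, $\operatorname{logit}^{-1}(\hat\beta_0+\hat\beta_1 x)$, with no variance correction term.

```python
import math
import random

def inv_logit(eta):
    """Inverse logit (logistic function)."""
    return 1.0 / (1.0 + math.exp(-eta))

def fit_logistic_newton(xs, ys, n_iter=25, tol=1e-10):
    """Maximise the Bernoulli log-likelihood for logit(p) = b0 + b1 * x by Newton's method."""
    b0, b1 = 0.0, 0.0
    for _ in range(n_iter):
        ps = [inv_logit(b0 + b1 * x) for x in xs]
        # Gradient of the log-likelihood with respect to (b0, b1).
        g0 = sum(y - p for y, p in zip(ys, ps))
        g1 = sum((y - p) * x for x, y, p in zip(xs, ys, ps))
        # Negative Hessian (Fisher information), a symmetric 2x2 matrix.
        ws = [p * (1.0 - p) for p in ps]
        h00 = sum(ws)
        h01 = sum(w * x for w, x in zip(ws, xs))
        h11 = sum(w * x * x for w, x in zip(ws, xs))
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system (negative Hessian) * d = gradient.
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0 += d0
        b1 += d1
        if abs(d0) + abs(d1) < tol:
            break
    return b0, b1

# Simulated data with hypothetical true coefficients (-0.5, 1.5).
rng = random.Random(1)
xs = [rng.uniform(-2.0, 2.0) for _ in range(5000)]
ys = [1 if rng.random() < inv_logit(-0.5 + 1.5 * x) else 0 for x in xs]

b0_hat, b1_hat = fit_logistic_newton(xs, ys)

# Predicted value at a new x: the estimated probability that Y = 1.
x_new = 1.0
p_new = inv_logit(b0_hat + b1_hat * x_new)
print(f"b0_hat={b0_hat:.3f}, b1_hat={b1_hat:.3f}, P(Y=1 | x={x_new}) = {p_new:.3f}")
```

Because the model includes an intercept, the fitted probabilities at the maximum-likelihood solution sum to the number of observed ones, which is a handy sanity check on convergence.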
