Why does logistic regression not require residuals to be normally distributed and homoscedastic the way linear regression does? Why does this not cause problems for estimating logistic regression models?
-
How would you define residuals anyway? If the observed response equals 0 or 1, then residuals must be bounded between $-1$ and $1$ and so can't be normally distributed. Also, even a well-behaved binary response can't be homoscedastic: if the mean is close to 0 or 1 the variance is small, but if the mean is close to 0.5 the variance is much bigger. Put otherwise, logistic regressions aren't produced by minimizing the sum of squared residuals, so many of the ideas associated with least squares don't carry over from linear regression. – Nick Cox Jan 22 '18 at 13:33
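To make the comment above concrete, here is a minimal sketch, assuming residuals are defined on the response scale as observed minus the true mean $p$. The function names, seed, and sample size are illustrative choices, not anything from the thread. It shows that the variance of a 0/1 response is $p(1-p)$, so it changes with the mean, and that such residuals can never leave the interval $(-1, 1)$.

```python
import random


def bernoulli_variance(p):
    """Variance of a 0/1 response whose mean is p."""
    return p * (1.0 - p)


def simulate_residuals(p, n=10_000, seed=0):
    """Draw n binary responses with mean p and return observed - p.

    Assumption: 'residual' here means the raw response-scale residual,
    computed against the true mean rather than a fitted value.
    """
    rng = random.Random(seed)
    ys = [1 if rng.random() < p else 0 for _ in range(n)]
    return [y - p for y in ys]


for p in (0.01, 0.5, 0.99):
    res = simulate_residuals(p)
    # Residuals have mean zero in expectation, so the mean square
    # estimates the variance.
    sample_var = sum(r * r for r in res) / len(res)
    print(f"p={p}: theoretical var={bernoulli_variance(p):.4f}, "
          f"sample var={sample_var:.4f}, "
          f"residual range=({min(res):.2f}, {max(res):.2f})")
```

For each $p$ the residuals take only two distinct values, $-p$ and $1-p$, which is another way of seeing that they cannot be normally distributed.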
-
I was thinking of the residuals in terms of the log odds, which I don't think would be bounded between 0 and 1. – grig109 Jan 22 '18 at 13:43
-
What exact definition do you have in mind therefore? I don't think your question can be understood without that. – Nick Cox Jan 22 '18 at 13:53
-
The residuals on the log-odds scale are not defined, since the log-odds of an observed 0 or 1 are $-\infty$ or $+\infty$ respectively. It's also worth asking, what is your understanding of what the normality assumption in linear regression accomplishes? – Matthew Drury Jan 22 '18 at 15:16
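A small sketch of why log-odds residuals break down, assuming one tried to define them as the logit of the observed response minus the fitted log-odds. The `logit` helper and the fitted values are hypothetical, chosen only for illustration.

```python
import math


def logit(p):
    """Log-odds of a probability p, returning +/- infinity at the boundaries."""
    if p == 0.0:
        return -math.inf
    if p == 1.0:
        return math.inf
    return math.log(p / (1.0 - p))


observed = [0, 1, 1, 0]
# Hypothetical fitted log-odds; not taken from any real model.
fitted_log_odds = [-1.2, 0.3, 2.0, -0.5]

# Every observed value is exactly 0 or 1, so every "residual" is infinite.
residuals = [logit(y) - eta for y, eta in zip(observed, fitted_log_odds)]
print(residuals)
```

Because each observed response sits exactly on the boundary, no finite residual exists on this scale, whatever the fitted values are.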
-
I don't think this is the same as the question about iid. Clearly related, but not the same. Something could be iid and not normal. – Peter Flom Jul 14 '18 at 12:02