Questions tagged [separation]

Separation occurs when some classes of a categorical outcome can be perfectly distinguished by a linear combination of other variables.

Separation (also called "perfect separation" or "complete separation", with the weaker forms called "partial separation" or "quasi-complete separation") is strongly related to the Hauck-Donner effect. It occurs when all observations with one level of a categorical outcome have values of some linear combination of the predictor variables greater than a value C, and all observations with the other level have values less than that same value C. In the quasi-complete case, some observations may lie exactly at C.
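Stated a little more formally (a sketch in notation that is assumed here rather than taken from the description above: $y_i \in \{0, 1\}$ is the outcome, $x_i$ the vector of predictors for observation $i$, and $b$ a vector of weights), complete separation means there exist $b$ and $C$ such that

$$
x_i^\top b > C \ \text{ for all } i \text{ with } y_i = 1,
\qquad
x_i^\top b < C \ \text{ for all } i \text{ with } y_i = 0 .
$$

Quasi-complete separation is the case where no such strict split exists, but some nonzero $b$ satisfies the same conditions with $\ge$ and $\le$ in place of $>$ and $<$, with equality for at least one observation.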

This phenomenon causes the maximum likelihood estimate (MLE) of coefficients in, e.g., logistic regression (and related variants) to diverge. Suppose we regress a completely separated dichotomous outcome on a single variable using logistic regression: the MLE of the coefficient for that variable does not exist. The likelihood keeps increasing as the coefficient grows in magnitude, so its supremum is approached only as the coefficient tends to infinity, and no finite value maximizes it. Separation causes further problems for Wald tests of those parameters, because the estimated standard errors become very large as well.
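The divergence can be seen numerically. What follows is a minimal sketch in Python, not a reference implementation: the six data points, the no-intercept model, the step size, and the function names are all assumptions made for illustration. It runs plain gradient ascent on the logistic log-likelihood for a completely separated outcome and prints the slope after increasing numbers of steps.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def log_likelihood(beta, xs, ys):
    """Bernoulli log-likelihood of a no-intercept logistic model with slope beta."""
    total = 0.0
    for x, y in zip(xs, ys):
        # log P(y | x) equals log(sigmoid(s)) with s = +beta*x for y = 1, -beta*x for y = 0
        s = beta * x if y == 1 else -beta * x
        if s >= 0:
            total += -math.log1p(math.exp(-s))
        else:
            total += s - math.log1p(math.exp(s))
    return total


def fit_slope(xs, ys, steps, lr=0.1):
    """Gradient ascent on the log-likelihood, starting from beta = 0."""
    beta = 0.0
    for _ in range(steps):
        grad = sum((y - sigmoid(beta * x)) * x for x, y in zip(xs, ys))
        beta += lr * grad
    return beta


# Hypothetical toy data: every x < 0 has y = 0 and every x > 0 has y = 1,
# so the outcome is completely separated at C = 0.
xs = [-3.0, -2.0, -1.0, 1.0, 2.0, 3.0]
ys = [0, 0, 0, 1, 1, 1]

for steps in (10, 100, 1000, 10000):
    beta = fit_slope(xs, ys, steps)
    print(steps, round(beta, 3), log_likelihood(beta, xs, ys))
```

With these data the gradient is positive for every finite slope, so the slope keeps growing and the log-likelihood keeps creeping towards 0 without reaching it, however many steps are taken. Software that stops after a fixed number of iterations simply reports wherever it happened to stop.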

171 questions
193 votes, 10 answers

How to deal with perfect separation in logistic regression?

If you have a variable which perfectly separates zeroes and ones in the target variable, R will yield the following "perfect or quasi perfect separation" warning message: Warning message: glm.fit: fitted probabilities numerically 0 or 1 occurred We…
user333
60 votes, 1 answer

Logistic regression in R resulted in perfect separation (Hauck-Donner phenomenon). Now what?

I'm trying to predict a binary outcome using 50 continuous explanatory variables (the range of most of the variables is $-\infty$ to $\infty$). My data set has almost 24,000 rows. When I run glm in R, I get: Warning messages: 1: glm.fit: algorithm…
Dcook
48 votes, 2 answers

Logistic regression model does not converge

I've got some data about airline flights (in a data frame called flights) and I would like to see if the flight time has any effect on the probability of a significantly delayed arrival (meaning 10 or more minutes). I figured I'd use logistic…
Daniel Standage
37 votes, 4 answers

Why does logistic regression become unstable when classes are well-separated?

Why is it that logistic regression becomes unstable when classes are well-separated? What does "well-separated classes" mean? I would really appreciate it if someone could explain with an example.
Jane Dow
27 votes, 1 answer

Is there any intuitive explanation of why logistic regression will not work for perfect separation case? And why adding regularization will fix it?

We have many good discussions about perfect separation in logistic regression, such as "Logistic regression in R resulted in perfect separation (Hauck-Donner phenomenon). Now what?" and "Logistic regression model does not converge". I personally…
Haitao Du
26 votes, 1 answer

Understanding complete separation for logistic regression

Why does logistic regression not converge for a linearly separable data set? For linearly separable data sets the model parameters go to infinity when minimizing the error function (according to Bishop2006, Pattern recognition and machine learning,…
Matthias
24 votes, 2 answers

What is the probability that $n$ random points in $d$ dimensions are linearly separable?

Given $n$ data points, each with $d$ features, $n/2$ are labeled as $0$, the other $n/2$ are labeled as $1$. Each feature takes a value from $[0,1]$ randomly (uniform distribution). What's the probability that there exists a hyperplane that can…
22 votes, 1 answer

Model selection with Firth logistic regression

In a small data set ($n\sim100$ ) that I am working with, several variables give me perfect prediction/separation. I thus use Firth logistic regression to deal with the issue. If I select the best model by AIC or BIC, should I include the Firth…
StasK
19 votes, 3 answers

Analysis of Danish mask study data by Nassim Nicholas Taleb (binomial GLM with complete separation)

Recently, Nassim Nicholas Taleb made this post about the recent Danish mask study, a randomized controlled trial which concluded that the proportion of newly diagnosed coronavirus infections was not significantly different among the group with…
16 votes, 1 answer

Seeking a Theoretical Understanding of Firth Logistic Regression

I am trying to understand Firth logistic regression (a method of handling perfect/complete or quasi-complete separation in logistic regression) so I can explain it to others in simplified terms. Does anyone have a dumbed-down explanation of what…
ESmith5988
15 votes, 3 answers

Intuition for Support Vector Machines and the hyperplane

In my project I want to create a logistic regression model for predicting binary classification (1 or 0). I have 15 variables, 2 of which are categorical, while the rest are a mixture of continuous and discrete variables. In order to fit a logistic…
TheGoat
14 votes, 1 answer

Issue with complete separation in logistic regression (in R)

I am trying to fit a logistic regression model for business defaults. Apart from the dichotomous variable default, the data set includes some performance ratios. When estimating the model in R, the following warning…
Marti
13 votes, 2 answers

Is R's glm function useless in a big data / machine learning setting?

I am surprised that R’s glm will “break” (not converge with default setting) for the following “toy” example (binary classification with ~50k data, ~10 features), but glmnet returns results in seconds. Am I using glm incorrectly (for example, should…
Haitao Du
11 votes, 1 answer

Binomial glmm with a categorical variable with full successes

I am running a glmm with a binomial response variable and a categorical predictor. The random effect is given by the nested design used for the data collection. The data looks like this: m.gen1$treatment [1] sucrose control protein …
AtiQP
9 votes, 1 answer

Enormous coefficients in logistic regression - what does it mean and what to do?

I get enormous coefficients during logistic regression, see coefficients with krajULKV: > summary(m5) Call: glm(formula = cbind(ml, ad) ~ rok + obdobi + kraj + resid_usili2 + rok:obdobi + rok:kraj + obdobi:kraj + kraj:resid_usili2 + …
Tomas