I have two variables in a dataset. The first, AHI, is continuous; the second, OSA_status, is a binary variable that I created from AHI: if AHI > 5 then OSA_status = 1, else OSA_status = 0. I am fitting a logistic regression with OSA_status (0/1) as the dependent variable and AHI as the independent variable. I expected AHI to be significantly associated with OSA_status, since OSA_status is derived from AHI itself, but my result is as follows. Can someone please explain why I got this result?

Call:
glm(formula = OSA_status ~ AHI, family = "binomial", data = pre_surgery)

Deviance Residuals: 
       Min          1Q      Median          3Q         Max  
-5.277e-04  -2.000e-08  -2.000e-08   2.000e-08   5.818e-04  

Coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept)    517.7    36115.6   0.014    0.989
AHI           -104.6     7314.3  -0.014    0.989

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 1.1829e+02  on 87  degrees of freedom
Residual deviance: 6.1704e-07  on 86  degrees of freedom
AIC: 4

Number of Fisher Scoring iterations: 25

Warning messages:
1: glm.fit: algorithm did not converge 
2: glm.fit: fitted probabilities numerically 0 or 1 occurred 

arshad

2 Answers

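Your data appear to show perfect separation: a single AHI cutoff splits the two outcome groups exactly. Notice the enormous standard errors of the coefficient estimates, the residual deviance of essentially zero, and the tell-tale warning messages: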

Warning messages:
1: glm.fit: algorithm did not converge 
2: glm.fit: fitted probabilities numerically 0 or 1 occurred 

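The software never arrived at a usable fit. Because you defined OSA_status with a fixed cutoff of AHI > 5, with no probabilistic element, this isn't surprising. When a cutoff exactly distinguishes the 2 groups, the likelihood keeps improving as the slope grows in magnitude, so no finite maximum-likelihood estimate exists. The reported coefficients are simply where the iterations stopped (25 Fisher scoring iterations in your output), which is why the standard errors are so large and the p-values are near 1.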
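To see the same behaviour on data you control, here is a minimal sketch using simulated values rather than the poster's data. The data frame `sim`, the seed, and the range of the simulated AHI values are all assumptions made for illustration; only the AHI > 5 rule and the `glm()` call mirror the question.

```r
# Simulated stand-in for the poster's data (hypothetical values)
set.seed(1)
sim <- data.frame(AHI = round(runif(88, min = 0, max = 30), 1))
sim$OSA_status <- as.integer(sim$AHI > 5)  # same rule as in the question

# The cutoff separates the groups exactly: every 0 lies at or below 5, every 1 above it
max(sim$AHI[sim$OSA_status == 0])
min(sim$AHI[sim$OSA_status == 1])

# Fit the same kind of model, collecting warnings instead of printing them
msgs <- character(0)
fit <- withCallingHandlers(
  glm(OSA_status ~ AHI, family = binomial, data = sim),
  warning = function(w) {
    msgs <<- c(msgs, conditionMessage(w))
    invokeRestart("muffleWarning")
  }
)

msgs                       # warnings raised while fitting
summary(fit)$coefficients  # compare the standard errors with the estimates
```

If the captured warnings and the inflated standard errors resemble those in the question, that supports separation, rather than a data problem, as the cause.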

For ways to deal with perfect separation, see for example this page and this page.

For the dangers of breaking a continuous predictor into discrete categories, see this page.

EdM

Are you sure that your OSA_status variable is coded as a factor and AHI as numeric?

You could also try the reverse analysis and report the result, for example an ANOVA with AHI as the outcome:

rstatix::anova_test(pre_surgery, AHI ~ OSA_status)

If AHI does not differ between the two OSA_status groups here, you may have made a mistake when coding your OSA_status variable.
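Both checks suggested above (column types and the coding of OSA_status) can be done directly. The sketch below uses a small made-up data frame, `pre_surgery_demo`, as a stand-in for the poster's `pre_surgery`; the values are invented for illustration. Note that `glm()` with a binomial family accepts either a 0/1 numeric response or a factor, so the factor coding here is just one option.

```r
# Hypothetical stand-in for the poster's data frame
pre_surgery_demo <- data.frame(AHI = c(0.8, 3.2, 4.9, 5.0, 6.4, 12.7, 30.1))
pre_surgery_demo$OSA_status <- factor(
  ifelse(pre_surgery_demo$AHI > 5, 1, 0),
  levels = c(0, 1)
)

# Check the column types
str(pre_surgery_demo)
is.numeric(pre_surgery_demo$AHI)
is.factor(pre_surgery_demo$OSA_status)

# Cross-tabulate the stored status against the rule AHI > 5;
# a correct coding leaves the off-diagonal cells empty
coding_check <- table(
  status  = pre_surgery_demo$OSA_status,
  above_5 = pre_surgery_demo$AHI > 5
)
coding_check
```

Running the same `table()` call on the real data frame would show whether any row has a status that disagrees with the AHI > 5 rule.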

Léo Henry