I am running (multi-level) logit models on hospital data to test whether the ratio of two hospital tariffs has any effect on the probability of being admitted to the hospital.

My models are the following:

# Models 1-2: single-level logit
f1 <- glm(admit ~ log_ratio, data = dat, family = "binomial")
f2 <- glm(admit ~ log_ratio * as.factor(Condition2), data = dat, family = "binomial")
# Models 3-5: random intercept per provider (note: lmer fits a linear mixed model)
f3 <- lmer(admit ~ log_ratio + (1 | provider_id), data = dat_first, REML = FALSE)
f4 <- lmer(admit ~ log_ratio + log_ratio:as.factor(Condition2) + (1 | provider_id), data = dat, REML = FALSE)
f5 <- lmer(admit ~ log_ratio * as.factor(Condition2) + (1 | provider_id), data = dat, REML = FALSE)

where admit is a binary variable indicating hospital admission, log_ratio is the log-transformed ratio of the two tariffs, Condition2 is the condition for which the patient received treatment and provider_id is the id of the treatment provider.
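
One caveat on the specification: lmer() fits a linear mixed model, so with a binary outcome Models 3 to 5 as written are linear probability models rather than multi-level logits. A minimal sketch of the logit counterpart follows, assuming the lme4 package is available. The data frame `sim` and its coefficients are simulated stand-ins invented for illustration, not the hospital data.

```r
library(lme4)

# Simulated stand-in data (hypothetical; NOT the hospital data)
set.seed(1)
n_prov <- 40
n_per  <- 50
sim <- data.frame(
  provider_id = factor(rep(seq_len(n_prov), each = n_per)),
  log_ratio   = rnorm(n_prov * n_per)
)
u   <- rnorm(n_prov, sd = 0.8)  # provider-level random intercepts
eta <- -1 + 0.6 * sim$log_ratio + u[as.integer(sim$provider_id)]
sim$admit <- rbinom(nrow(sim), 1, plogis(eta))

# Linear mixed model: coefficients are on the probability scale
m_lin <- lmer(admit ~ log_ratio + (1 | provider_id), data = sim, REML = FALSE)

# Multi-level logit: coefficients are on the log-odds scale
m_logit <- glmer(admit ~ log_ratio + (1 | provider_id), data = sim,
                 family = binomial)

fixef(m_lin)
fixef(m_logit)
```

The two sets of fixed effects are not directly comparable, which may explain why the coefficients in columns (3) to (5) are so much smaller than those in columns (1) and (2).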

My results are the following:


============================================================================================================
                                                                   Dependent variable:                      
                                              --------------------------------------------------------------
                                                                          admit                             
                                                     logistic                         linear                
                                                                                  mixed-effects             
                                                  (1)         (2)         (3)          (4)          (5)     
------------------------------------------------------------------------------------------------------------
log_ratio                                      0.605***     1.385*      0.012***     0.013***      0.012    
                                                (0.058)     (0.808)     (0.001)      (0.002)      (0.013)   
                                                                                                            
as.factor(Condition2)Brain Disorder                          1.392                                 -0.001   
                                                            (1.304)                               (0.021)   
                                                                                                            
as.factor(Condition2)Amputation                              2.352                                 0.0002   
                                                            (2.257)                               (0.072)   
                                                                                                            
as.factor(Condition2)Chronic pain                           4.403**                                -0.026   
                                                            (2.161)                               (0.055)   
                                                                                                            
as.factor(Condition2)Nervous system                          2.391                                 0.006    
                                                            (1.551)                               (0.026)   
                                                                                                            
as.factor(Condition2)Organ Disorder                          1.114                                 -0.018   
                                                            (1.787)                               (0.043)   
                                                                                                            
log_ratio:as.factor(Condition2)Brain Disorder               -0.098                   0.014***      0.014    
                                                            (0.825)                  (0.002)      (0.014)   
                                                                                                            
log_ratio:as.factor(Condition2)Amputation                   -0.067                   0.053***      0.053    
                                                            (1.741)                  (0.003)      (0.059)   
                                                                                                            
log_ratio:as.factor(Condition2)Chronic pain                 -1.926                   0.009***      0.024    
                                                            (1.230)                  (0.001)      (0.030)   
                                                                                                            
log_ratio:as.factor(Condition2)Nervous system               -1.537                   0.005***     -0.0003   
                                                            (1.050)                  (0.001)      (0.018)   
                                                                                                            
log_ratio:as.factor(Condition2)Organ Disorder                0.457                   0.023***      0.037    
                                                            (1.242)                  (0.001)      (0.032)   
                                                                                                            
Constant                                       -4.476***   -6.550***    0.022***      0.007        0.008    
                                                (0.075)     (1.293)     (0.008)      (0.009)      (0.023)   
                                                                                                            
------------------------------------------------------------------------------------------------------------
Observations                                    115,376     115,376     115,376      115,376      115,376   
Log Likelihood                                -12,589.260 -12,229.590  56,636.920   56,935.280   56,935.630 
Akaike Inf. Crit.                             25,182.530  24,483.190  -113,265.800 -113,852.600 -113,843.300
Bayesian Inf. Crit.                                                   -113,227.200 -113,765.700 -113,708.100
============================================================================================================
Note:                                                                            *p<0.1; **p<0.05; ***p<0.01

My main interest is in log_ratio and whether it has any association with the probability of hospital admission. My secondary interest is whether this association differs by condition. Log-ratio is significant in every model (logit or multi-level) except when I interact it with condition and keep all main effects in the model, as in Models 2 & 5. The terms involving log-ratio then lose their significance.

My question is: which model should I believe? My gut feeling is that log-ratio is in fact significant, but that it does not show up as significant in Models 2 & 5 because of the small sample size per condition or for some other reason. Could this be true?

Also, could Model 4 be an acceptable specification of the model? That is, in my case, do I need to have all main effects in the model?

Stata_user
    The question of “which” model you believe is not straightforward. You can compare some of the models using the anova() function. You could also keep a hold-out sample and test each model on it. If your interaction term is significant, then you probably want to keep it in the model and probe for which conditions the association is significant. – Matt Barstead Dec 26 '20 at 01:07
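
The hold-out idea in the comment above can be sketched as follows. This is only an illustration on simulated data: the data frame `sim`, the factor `cond`, the 80/20 split, and the `log_loss` helper are all assumptions made for the example, not part of the original analysis.

```r
# Simulated stand-in data (hypothetical; NOT the hospital data)
set.seed(42)
n <- 5000
sim <- data.frame(
  log_ratio = rnorm(n),
  cond      = factor(sample(c("A", "B", "C"), n, replace = TRUE))
)
slope <- c(A = 0.3, B = 0.6, C = 0.9)[as.character(sim$cond)]
sim$admit <- rbinom(n, 1, plogis(-1 + slope * sim$log_ratio))

# 80/20 train/test split
train_idx <- sample(n, size = 0.8 * n)
train <- sim[train_idx, ]
test  <- sim[-train_idx, ]

m_small <- glm(admit ~ log_ratio, data = train, family = binomial)
m_big   <- glm(admit ~ log_ratio * cond, data = train, family = binomial)

# Out-of-sample log loss (lower is better)
log_loss <- function(model, newdata) {
  p <- predict(model, newdata = newdata, type = "response")
  -mean(newdata$admit * log(p) + (1 - newdata$admit) * log(1 - p))
}

scores <- c(small = log_loss(m_small, test), big = log_loss(m_big, test))
scores
```

With clustered data such as patients within providers, it may be more appropriate to hold out whole providers rather than individual rows.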

1 Answer


Building off Matt Barstead's comment, compare the fit information of model 3 and model 4. Model 4's log likelihood is higher (56,935.28 vs 56,636.92, with 5 additional degrees of freedom used), and its AIC and BIC are both lower (more negative), which suggests that model 4 fits better than model 3. You can use R's anova() function to test the two models formally. A quick and dirty likelihood-ratio calculation points the same way:

> 1-pchisq((-2*56636.920)-(-2*56935.280), 5)
[1] 0

The test statistic is 2 × (56,935.28 − 56,636.92) = 596.72 on 5 degrees of freedom, and the 0 printed above is the p-value of the test (rounded), not the test statistic. A p-value this small means the less parsimonious model fits significantly better than the more parsimonious one, so the log_ratio slopes do appear to differ by condition. See information on these tests here.
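
The likelihood-ratio arithmetic can be spelled out step by step. The first part of this sketch uses only the log likelihoods reported in the question's table. The second part shows the same test through anova() on simulated stand-in data (`sim`, `cond`, and the slopes are invented for illustration), since the original data are not available here.

```r
# Likelihood-ratio test from the log likelihoods reported in the table
ll_m3 <- 56636.920   # Model 3: log_ratio only
ll_m4 <- 56935.280   # Model 4: adds 5 log_ratio:Condition2 terms
lr_stat <- 2 * (ll_m4 - ll_m3)
p_value <- pchisq(lr_stat, df = 5, lower.tail = FALSE)
lr_stat
p_value

# Same kind of test via anova() on simulated data (NOT the hospital data)
set.seed(7)
n <- 3000
sim <- data.frame(
  log_ratio = rnorm(n),
  cond      = factor(sample(c("A", "B", "C"), n, replace = TRUE))
)
slope <- c(A = 0.2, B = 0.6, C = 1.0)[as.character(sim$cond)]
sim$admit <- rbinom(n, 1, plogis(-1 + slope * sim$log_ratio))

m_small <- glm(admit ~ log_ratio, data = sim, family = binomial)
m_big   <- glm(admit ~ log_ratio + log_ratio:cond, data = sim,
               family = binomial)

lrt <- anova(m_small, m_big, test = "Chisq")
lrt

# The anova() deviance difference equals 2 * (difference in log likelihoods)
manual <- as.numeric(2 * (logLik(m_big) - logLik(m_small)))
manual
```

Using lower.tail = FALSE asks pchisq() for the upper tail directly, which avoids the subtraction 1 - pchisq(...) collapsing to exactly 0. A test of this kind assumes both models are fit to the same observations.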

Erik Ruzek
  • Thanks Erik and Matt for your comments. If I understand correctly, this would mean that dividing my sample by condition and testing the relationship by condition is irrelevant, indicating that all conditions have a similar correlation with the log_ratio variable. Is that the case? – Stata_user Dec 27 '20 at 11:13
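  • A clarification: these are two separate questions. Whether the log ratio slope differs by condition is what the interaction terms capture (the model 3 vs model 4 comparison), so that is the result to consult before deciding whether splitting by condition is worthwhile. Whether the conditions have similar admittance rates after adjusting for log ratio is about the condition main effects (the model 4 vs model 5 comparison). – Erik Ruzek Dec 28 '20 at 16:11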
  • Thanks @Erik Ruzek. – Stata_user Dec 29 '20 at 17:22
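
The difference between the Model 4 and Model 5 specifications discussed in this thread shows up in their design matrices. The sketch below uses a made-up toy data frame (`toy` and `cond` are hypothetical names) and then applies the same likelihood-ratio arithmetic to the Model 4 and Model 5 log likelihoods reported in the question's table.

```r
# Toy data frame, only used to inspect which columns each formula creates
toy <- data.frame(
  log_ratio = c(0.5, 1.0, 1.5, 2.0, 2.5, 3.0),
  cond      = factor(rep(c("A", "B", "C"), times = 2))
)

# Model 4 style: condition-specific slopes, one shared intercept
X4 <- model.matrix(~ log_ratio + log_ratio:cond, data = toy)
# Model 5 style: condition-specific slopes AND condition-specific intercepts
X5 <- model.matrix(~ log_ratio * cond, data = toy)

colnames(X4)
colnames(X5)
extra_cols <- setdiff(colnames(X5), colnames(X4))  # the condition main effects
extra_cols

# Likelihood-ratio test of Model 5 against Model 4, from the table
ll_m4 <- 56935.280
ll_m5 <- 56935.630
lr_45 <- 2 * (ll_m5 - ll_m4)
p_45  <- pchisq(lr_45, df = 5, lower.tail = FALSE)
lr_45
p_45
```

Leaving out the main effects, as Model 4 does, forces every condition to share the same intercept. On the reported log likelihoods, the five extra main-effect terms in Model 5 barely change the fit, although dropping main effects is usually something to justify on substantive grounds as well.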