
I have two predictors in a binary logistic regression model, both standardized so they are on the same scale. I find it puzzling that the variable with the smaller OR (1.3) is significant at p < .05, while the one with the larger OR (2.0) is NOT significant in the same model.

More importantly, when the two variables are tested in separate models, the variable with the larger OR also yields a higher predicted probability for the outcome (75% vs. 60% for the smaller).
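For intuition about that pattern, here is how a log odds ratio maps to a predicted probability at a given predictor value. The intercept below is made up for illustration (it is not from the separate models mentioned above):

```r
# Illustration only: predicted probability at X = 1 (one SD above the mean
# for a standardized predictor). The intercept b0 is hypothetical.
b0 <- -1.78
p_or2  <- plogis(b0 + 1 * log(2))    # predictor with OR = 2
p_or13 <- plogis(b0 + 1 * log(1.3))  # predictor with OR = 1.3
c(p_or2, p_or13)  # the larger OR yields the higher predicted probability
```

At any fixed intercept, the larger OR always gives the higher fitted probability at the same X, so the pattern in the separate models is expected regardless of significance.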

Could someone please explain how/why this might be, or whether it could be diagnostic of a problem (multicollinearity?).
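As a quick sanity check on the multicollinearity guess, one could look at the correlation between the two standardized predictors (and, with only two predictors, the implied VIF). The data below are simulated stand-ins, since the actual `imp` object is not shown:

```r
# Sketch of a collinearity check on simulated data standing in for the
# real predictors; X1/X2 here are made up, with correlation ~0.6 built in.
set.seed(1)
n  <- 400
x1 <- rnorm(n)
x2 <- 0.6 * x1 + sqrt(1 - 0.6^2) * rnorm(n)
r  <- cor(x1, x2)
r
# With exactly two predictors, VIF = 1 / (1 - r^2) for both of them.
vif <- 1 / (1 - r^2)
vif
```

A pairwise correlation well below ~0.8–0.9 rarely explains a flipped significance pattern on its own, but a high value would make the inflated SE on one coefficient unsurprising.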

Results

f1=with(data=imp, glm(Y~X1+X2, family=binomial(link="logit")))

s01=summary(pool(f1))

s01

                     est        se         t       df   Pr(>|t|) 

   (Intercept) -1.7805826 0.1857663 -9.585070 391.0135 0.00000000 
   X1           0.2662796 0.1308970  2.034268 390.4602 0.04259997  
   X2           0.6757952 0.3869652  1.746398 395.6098 0.08151794 

cbind(exp(s01[, c("est", "lo 95", "hi 95")]), pval=s01[, "Pr(>|t|)"])

                             est     lo 95     hi 95       pval
              (Intercept) 0.1685399 0.1169734 0.2428389 0.00000000
              X1          1.3051000 1.0089684 1.6881459 0.04259997
              X2          1.9655955 0.9185398 4.2062035 0.08151794
As you can see, the s01 estimates are log odds ratios.

logOR=log(1.9655955)

logOR
[1] 0.6757953

Despite X2 having larger OR and log odds ratio, it is not significant in the model.
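This follows directly from the printed output: the t statistic is just the estimate divided by its standard error, so X2's much larger SE is what makes its larger log OR non-significant:

```r
# Recomputing the t statistics from the pooled estimates and SEs above:
# significance depends on est / se, not on the size of est (or the OR) alone.
est <- c(X1 = 0.2662796, X2 = 0.6757952)
se  <- c(X1 = 0.1308970, X2 = 0.3869652)
t   <- est / se
round(t, 6)  # reproduces the 2.034268 and 1.746398 in the pooled summary
```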

Update

One possibility I have not tried is testing the mean difference between the two OR (or log OR?) using the following method: Statistical test for difference between two odds ratios?

I am curious if testing the difference between the two log OR might change the conclusions and would appreciate suggestions on how I might do so using the output above (getting the standard error of logOR for the two coefficients).
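A rough version of that test can be built from the pooled estimates and SEs printed above, treating the difference of the two log ORs as approximately normal. Note this sketch assumes the two coefficients are uncorrelated, which they generally are not in the same model; a proper SE would also need the covariance between the two pooled coefficients:

```r
# Hedged Wald-type z-test for the difference of the two log odds ratios,
# using the pooled estimates/SEs above and ignoring cov(b1, b2).
b1 <- 0.2662796; se1 <- 0.1308970   # X1
b2 <- 0.6757952; se2 <- 0.3869652   # X2
d       <- b2 - b1
se_diff <- sqrt(se1^2 + se2^2)      # assumes cov(b1, b2) = 0
z       <- d / se_diff
p       <- 2 * pnorm(-abs(z))
c(diff = d, se = se_diff, z = z, p = p)
```

Here z is about 1.0 (p ≈ 0.32), so under this approximation the two log ORs are not distinguishable from each other, even though one is individually significant and the other is not.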

ksroogl

  • Significance is not the same as effect size--of which the OR is an example. It is determined in logistic regression by comparing the OR to its standard error. Doesn't your computation produce and report standard errors of the estimates? What are they equal to? – whuber Jun 03 '18 at 20:34
  • SE for the non-significant predictor is in fact larger (0.4 vs. 0.13 for the significant predictor)! Given this, is it fair to conclude that the significant predictor (with the smaller OR) predicts the binary outcome above and beyond the other? Does the interpretation work the same way in logistic regression? – ksroogl Jun 03 '18 at 20:41
  • As for viewing OR as a form of an effect size, is it your suggestion that I divide the OR by its SE? If not, would you have any particular suggestion? – ksroogl Jun 03 '18 at 20:56
  • Usually it's the log OR that is estimated and the SE refers to that. In that case, dividing the former by the latter gives a t-statistic that is either referred to a Student t distribution or a Normal distribution to determine its significance. That detail aside, what you want to compare is the sizes of those ratios, not the ORs themselves. – whuber Jun 03 '18 at 22:04
  • Thank you! Converting the OR to effect size d shows that effect size is also larger for the larger OR. Does this suggest that the p-values for the two variables are misleading? – ksroogl Jun 04 '18 at 11:52
  • No, because p-values and effect sizes tell you two completely different things. – whuber Jun 04 '18 at 13:31
  • Can you post your data? – gung - Reinstate Monica Jun 04 '18 at 18:05
  • Included the output but do let me know if more is needed. – ksroogl Jun 05 '18 at 00:42
  • (@gung) Included the output but do let me know if more is needed – ksroogl Jun 05 '18 at 23:31
  • I don't see your data included in the output. Your data.frame seems to be `imp`. Can you post that? – gung - Reinstate Monica Jun 10 '18 at 21:39
  • @gung, I see. I have never posted actual data up here and it is a rather large dataset with 20 imputed sets. Do you have a suggestion for how I could do that? – ksroogl Jun 11 '18 at 23:01
  • Hmmm, that implies this is pretty complicated & the issue may be due to any number of aspects of your data, model, etc. See if you can create a [reproducible example](https://stackoverflow.com/q/5963269/1217536) somehow. Otherwise, I'm not sure if this will be answerable. – gung - Reinstate Monica Jun 11 '18 at 23:43
  • @gung, makes sense. I was hoping that testing the mean difference between the two log odds ratios could be one more thing to try. I am stuck at the level of computing the SE though (as noted above). Would you be able to provide any pointers? – ksroogl Jun 12 '18 at 00:01
  • Please post a reproducible example. – gung - Reinstate Monica Jun 12 '18 at 00:07
  • re: earlier comment, does the output above warrant any interpretation or must every question be reproduced to an exact pattern? – ksroogl Jun 12 '18 at 01:22

0 Answers