Interpretation of the significance of interactions between factors in a Cox model

Question

I have two variables, X and Y. X is continuous and Y is a factor (0, 1). Individual regressions for each variable give the following significant results in terms of increased mortality:

coxph(formula = Surv(SUPERV, STATUS) ~ X)

  n= 50, number of events= 30 

          coef exp(coef) se(coef)     z Pr(>|z|)  
X         0.5391    1.7144   0.2386 2.26   0.0239 *

and,

Call:
coxph(formula = Surv(SUPERV, STATUS) ~ Y)

  n= 50, number of events= 30 

       coef exp(coef) se(coef)     z Pr(>|z|)   
Y      1.2457    3.4754   0.4199 2.967  0.00301 **

However, in a multiple regression the X variable loses its significance:

coxph(formula = Surv(SUPERV, STATUS) ~ X + Y)

  n= 50, number of events= 30 

              coef exp(coef) se(coef)     z Pr(>|z|)  
X             0.3110    1.3648   0.2733 1.138   0.2550  
Y1            0.9747    2.6505   0.4628 2.106   0.0352 *

And, when I add the interaction term X:Y nothing is significant:

Call:
coxph(formula = Surv(SUPERV, STATUS) ~ X+Y+X:Y)

  n= 50, number of events= 30 

              coef exp(coef) se(coef)      z Pr(>|z|)
X            0.4418    1.5555   0.3517  1.256    0.209
Y1           1.9000    6.6860   1.4907  1.275    0.202
X:Y1        -0.3771    0.6858   0.5804 -0.650    0.516

But if I study only the interaction term it is significant:

Call:
coxph(formula = Surv(SUPERV, STATUS) ~ X:Y)

  n= 50, number of events= 30 

             coef exp(coef) se(coef)     z Pr(>|z|)  
X:Y0       0.2604    1.2975   0.2873 0.907   0.3646  
X:Y1       0.5805    1.7869   0.2327 2.494   0.0126 *

I do not understand how to interpret the loss or gain of significance when interactions are considered. Why does X lose its significance when Y is added to the regression? Why is the interaction significant in ~X:Y but not in ~X+Y+X:Y?

Is there a strong association between X and Y? For example, what are the mean and SD of X for the two Y groups? — EdM, Aug 14 '20 at 13:39
Yes, they are very different: mean = 15.6, SD = 7.0 vs. mean = 9.0, SD = 10.0. Could that alone explain this behaviour? — Greynes, Aug 17 '20 at 16:13

score 1 · Answer 1 · answered Aug 17 '20 at 19:16

In this particular case the strong association of your continuous predictor X with the categorical predictor Y is probably why you found this behavior. Y seems very strongly associated with outcome, based on the single-predictor and additive 2-predictor models. You can think about that additive model as showing the association of Y with outcome while also accounting for X. As you might often see in such situations, Y was still "significant" but its hazard ratio wasn't quite as large as it was when you didn't take X into account.

The single-predictor association of X with outcome might simply have represented its association with Y, which wasn't taken into account in the model including only X as a predictor. On the other hand, X might actually be associated with outcome, consistent with its positive hazard ratio point estimate in the additive model, but with this amount of data the error in that point estimate is too large to support that statement reliably.
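If it helps to see this mechanism in isolation, here is a minimal simulation sketch. The data are made up, not yours: I assume that only Y truly affects the hazard and that X is merely shifted upward in the Y = 1 group. The effect sizes, rates, and the data frame name `dat` are all arbitrary choices for illustration; only the column names SUPERV, STATUS, X and Y follow your output.

```r
# Illustrative simulation only: these data are invented, not the asker's.
library(survival)

set.seed(42)
n <- 50

# Y is binary; X is continuous and shifted upward when Y == 1,
# so the two predictors are associated with each other.
Y <- rbinom(n, 1, 0.5)
X <- rnorm(n, mean = ifelse(Y == 1, 1.5, 0), sd = 1)

# Assumption: only Y affects the hazard (log hazard ratio 1.2);
# X has no effect of its own.
event_time  <- rexp(n, rate = 0.1 * exp(1.2 * Y))
censor_time <- rexp(n, rate = 0.05)

dat <- data.frame(
  SUPERV = pmin(event_time, censor_time),          # observed follow-up time
  STATUS = as.integer(event_time <= censor_time),  # 1 = event, 0 = censored
  X = X,
  Y = factor(Y)
)

# How different is X between the two Y groups?
aggregate(X ~ Y, data = dat, FUN = mean)

fit_x   <- coxph(Surv(SUPERV, STATUS) ~ X, data = dat)      # X alone
fit_add <- coxph(Surv(SUPERV, STATUS) ~ X + Y, data = dat)  # additive model
fit_int <- coxph(Surv(SUPERV, STATUS) ~ X * Y, data = dat)  # X + Y + X:Y

summary(fit_x)$coefficients
summary(fit_add)$coefficients

# Likelihood-ratio test of the interaction against the additive model
anova(fit_add, fit_int)
```

Comparing the coefficient of X in `fit_x` and `fit_add` shows how much of its apparent effect is borrowed from Y in a given simulated sample. Results will vary with the seed, so repeating the run over several seeds gives a better feel for it than any single fit.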

When you add an interaction you are adding a third predictor to the model. A rule of thumb for properly fitting a Cox survival model is to have about 15 events per predictor that you are considering in the model. So with 30 events you are pushing past that limit when you try to go beyond 2 predictors. You will see this type of thing often when you try to push a regression model beyond the available data: effects that are "significant" in an additive model disappear when you try to add an interaction.
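To make that concrete with the numbers from your own output, here is a short worked sketch. The notation is mine, not from the R printout: $h_0(t)$ is the baseline hazard and the $\beta$'s are the coefficients of the X + Y + X:Y model.

```latex
\begin{align*}
% Interaction model, with Y coded 0/1
h(t \mid X, Y) &= h_0(t)\,\exp\!\left(\beta_X X + \beta_Y Y + \beta_{XY}\,XY\right) \\[4pt]
% Log-hazard slope of X within each Y group
\text{slope of } X \text{ when } Y = 0 &: \quad \beta_X = 0.4418 \\
\text{slope of } X \text{ when } Y = 1 &: \quad \beta_X + \beta_{XY} = 0.4418 - 0.3771 = 0.0647 \\[4pt]
% Events per predictor in this model
\frac{30 \text{ events}}{3 \text{ predictors}} &= 10 < 15
\end{align*}
```

The interaction coefficient is just the difference between the two group-specific slopes, and its standard error (0.5804) is larger than the estimate itself, so with 30 events the data cannot tell you whether those slopes really differ.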

You shouldn't try to get around this problem by including the interaction without the main effects of X and Y, as such practice is generally not a good idea.
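One way to see why, sketched in my own notation ($\gamma_0$ and $\gamma_1$ stand for the coefficients labelled X:Y0 and X:Y1 in your output): with a factor Y, the formula ~X:Y fits a separate slope for X in each Y group but no term for Y itself.

```latex
\begin{align*}
% Model fitted by ~ X:Y when Y is a 0/1 factor
h(t \mid X, Y) &= h_0(t)\,\exp\!\left(\gamma_0\,X\,\mathbb{1}[Y=0] + \gamma_1\,X\,\mathbb{1}[Y=1]\right),
\qquad \gamma_0 = 0.2604,\ \gamma_1 = 0.5805 \\[4pt]
% Consequence of omitting the main effect of Y
\frac{h(t \mid X = 0,\, Y = 1)}{h(t \mid X = 0,\, Y = 0)} &= \frac{h_0(t)\,e^{0}}{h_0(t)\,e^{0}} = 1
\end{align*}
```

So this model forces the two groups to have the same hazard at X = 0, and any difference between the groups has to be expressed through the slopes. The "significant" X:Y1 row tests whether $\gamma_1 = 0$ under that constraint. It is not a test of whether the slope of X differs between the groups, which is what the X:Y1 row in the full X + Y + X:Y model addresses.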

What you saw here could also be seen in ordinary least squares: an apparent association of a predictor with outcome that might primarily be due to its association with a better predictor that was left out of the model. That's one example of what's called omitted-variable bias.

Cox models can even show bias if an omitted predictor isn't associated directly with the one you're evaluating, just with outcome. That tends to have the opposite effect from what you saw: you underestimate the magnitude of the coefficient for the predictor you're evaluating, like the omitted-variable bias in logistic regression.

So as you continue with this type of work it's generally best to use your knowledge of the subject matter to start modeling with as many predictors as are compatible with the number of events you have. If you are willing to use penalization methods like ridge regression, elastic net, or LASSO, you could include even more predictors.
