I am constructing Cox models that predict survival in a clinical trials cohort.

After speaking to our statistician (who is away at the moment, hence this post), I was advised to build the Cox survival models by forward selection using likelihood ratio tests: start with a base model and add a term if it improves the fit. The p value comes from the likelihood ratio statistic, i.e. the difference in -2 × log partial likelihood between the base model and the extended model, as outlined in the R code below.

I realise that Stata is probably a better fit for this sort of analysis, but i) I don't have easy access to Stata and ii) I am familiar with R (and also have access to SPSS). With that caveat, the general format of the code I am using is shown further down.

Conceptually, this makes sense to me if I add binary covariates to the model, but I was wondering whether this approach is appropriate for adding a continuous variable, as outlined below. I'm not sure whether one degree of freedom is correct for this comparison.

y <- 0:1
data <- data.frame(cbind(sample(y, 100, replace = TRUE), runif(100, min = 0, max = 10), sample(y, 100, replace = TRUE), runif(100, min = 0, max = 1))) # Simulated example data
colnames(data) <- c("EFS_Status", "EFS_Time", "var1", "Contvar2")
library(survival)
base <- coxph(Surv(EFS_Time, EFS_Status) ~ var1, data = data) # Create base model
lr1 <- -2 * base$loglik[2] # -2 * log partial likelihood of the base model

extend <- coxph(Surv(EFS_Time, EFS_Status) ~ var1 + Contvar2, data = data) # Extended model
lr2 <- -2 * extend$loglik[2] # -2 * log partial likelihood of the extended model

pchisq(q = lr1 - lr2, df = 1, lower.tail = FALSE) # lr1 - lr2 is the likelihood ratio statistic; 1 df correct for continuous variables?

Any guidance is most appreciated. Obviously, I could binarize the continuous variable, but I suppose you're losing information by taking that approach.

Thanks for reading, Ed

EdS

1 Answer

Yes, a continuous variable added as a linear effect to the model formula as you have here adds one degree of freedom to the model, as there is one more model parameter to estimate.
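As a rough illustration of this parameter-counting rule, the sketch below compares the number of estimated coefficients between nested models. The simulated data and the three-level factor `grade` are hypothetical, added only for contrast: a factor with k levels contributes k - 1 parameters, whereas a continuous variable entered linearly contributes one.

```r
library(survival)

set.seed(42)  # simulated data, for illustration only
n <- 100
d <- data.frame(
  EFS_Status = sample(0:1, n, replace = TRUE),
  EFS_Time   = runif(n, min = 0, max = 10),
  var1       = sample(0:1, n, replace = TRUE),
  Contvar2   = runif(n, min = 0, max = 1),
  # Hypothetical three-level factor, not part of the original question
  grade      = factor(sample(c("low", "mid", "high"), n, replace = TRUE))
)

base      <- coxph(Surv(EFS_Time, EFS_Status) ~ var1, data = d)
with_cont <- coxph(Surv(EFS_Time, EFS_Status) ~ var1 + Contvar2, data = d)
with_fac  <- coxph(Surv(EFS_Time, EFS_Status) ~ var1 + grade, data = d)

# Degrees of freedom for each likelihood ratio test =
# number of extra coefficients in the extended model
df_cont <- length(coef(with_cont)) - length(coef(base))
df_fac  <- length(coef(with_fac)) - length(coef(base))

cat("df added by continuous term:", df_cont, "\n")
cat("df added by 3-level factor:", df_fac, "\n")
```

The same counting applies if the continuous variable were entered non-linearly (for example with a spline), in which case it would add more than one parameter.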

By the way, instead of using pchisq to compute the p-value it's easier and less error-prone to use anova(). This will automatically calculate the correct degrees of freedom for the test. See ?anova.coxph.
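A minimal sketch of that comparison, assuming simulated data with the same column names as in the question: it runs the manual `pchisq` calculation and `anova()` side by side so the two p-values can be checked against each other. The p-value is read from the table by position, since the label of that column may differ between versions of the survival package.

```r
library(survival)

set.seed(1)  # simulated data, for illustration only
n <- 100
d <- data.frame(
  EFS_Status = sample(0:1, n, replace = TRUE),
  EFS_Time   = runif(n, min = 0, max = 10),
  var1       = sample(0:1, n, replace = TRUE),
  Contvar2   = runif(n, min = 0, max = 1)
)

base   <- coxph(Surv(EFS_Time, EFS_Status) ~ var1, data = d)
extend <- coxph(Surv(EFS_Time, EFS_Status) ~ var1 + Contvar2, data = d)

# Manual likelihood ratio test: twice the gain in log partial likelihood
lr_stat  <- 2 * (extend$loglik[2] - base$loglik[2])
p_manual <- pchisq(lr_stat, df = 1, lower.tail = FALSE)

# The same test via anova(); the degrees of freedom are worked out for us
tab <- anova(base, extend)
print(tab)

df_anova <- tab[2, 3]  # third column: difference in degrees of freedom
p_anova  <- tab[2, 4]  # fourth column: p-value of the test

cat("manual p:", p_manual, " anova p:", p_anova, "\n")
```

Both models must be fitted to the same observations for the comparison to be valid, so rows with missing values in the added covariate need handling before fitting.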

onestop