1

Based on page 164-165 of [Collett:Modelling Survival data in medical research (Third edition-2014)] the Schoenfeld for individual test and for global test can be obtained as follows[1]Schoenfeld formula.

However, computing the Chi-square values using own code gives different answers as using cox.zph(res.cox). Survival package version used is survival_3.1-12

rm(list=ls()); set.seed(1234)
 library("survival")
 hmohiv<-read.table("https://stats.idre.ucla.edu/stat/r/examples/asa/hmohiv.csv", sep=",", header = TRUE)
 #______Schoenfeld Residuals from coxph()__________
 res.cox<-coxph( Surv(time, censor)~drug+age, data=hmohiv,ties="breslow")
 sresid <- residuals(res.cox, "schoenfeld")
 #______________Own code__________
 beta<-as.matrix( c (res.cox$coefficients[1], res.cox$coefficients[2]) )
 OrderedData <- hmohiv[order(hmohiv$time), ] #order the data according to the event times
 x_ordered<-as.matrix(OrderedData[ ,4:3]) #Consider x1 and x2.
 n<-nrow(OrderedData)
 Numerator1<-numeric(n);Numerator2<-  numeric(n);Denominator<-numeric(n)
 Num1<-  numeric(n);Num2<-  numeric(n);Den<-numeric(n)
 for(j in 1:n) { 
  Numerator1[j] <- x_ordered[j,1]*exp( x_ordered[j,]%*%beta) 
  Numerator2[j] <- x_ordered[j,2]*exp( x_ordered[j,]%*%beta)     
  Denominator[j]<-exp( x_ordered[j,] %*%beta) 
 }
 
 for(j in 1:n) {
   thistime<-OrderedData$time[j]
   riskset <- which(thistime<=OrderedData$time)
   Num1[j] <- sum(Numerator1[riskset])
   Num2[j]<- sum(Numerator2[riskset])
   Den[j] <- sum(Denominator[riskset]) 
 }
 
Schoenfeld<-as.matrix(cbind( (x_ordered[,1]-Num1/Den), (x_ordered[,2]-Num2/Den))  )
Schoenfeldr<- Schoenfeld[OrderedData$censor==1, ]


> OrderedData.Cens<- OrderedData[OrderedData$censor==1, ]
> ObservedTime<- OrderedData.Cens$time
> sch1 <- residuals(res.cox,"scaledsch")

> xx <- ObservedTime - mean(ObservedTime)
> #__________Schoefeld Individual test__________#
> temp1 <- xx %*% sch1 #sch1 : Scaled Schoenfeld residual
> temp2 <-  res.cox$var  # Variance covariance matrix
> test1.Value <- temp1[1] %*% t(temp1)[1] /
  ( length(ObservedTime) *temp2[1,1]* sum(xx^2) )
> test2.Value <- temp1[2] %*% t(temp1)[2] /
   ( length(ObservedTime) *temp2[2,2]* sum(xx^2) )
> #_________Global test_________________#
> temp1 <- xx %*%  Schoenfeldr # Unscaled Schoenfeld Residuals
> test.Global<- temp1 %*% temp2  %*% t(temp1) /
   ( sum(xx^2) / length(ObservedTime) )

> data.frame(test1.Value ,test2.Value,test.Global )
  test1.Value test2.Value test.Global
  0.2113885  0.01205375   0.2502792

> test.ph <- cox.zph(res.cox)
> test.ph
          chisq df    p
drug   4.98e-06  1 1.00
age    8.05e-02  1 0.78
GLOBAL 8.64e-02  2 0.96
  • Please edit your question to include the publication date of the reference and the `survival` software package version that you used. The details of `cox.zph()` calculations changed in the past few years. See [this answer](https://stats.stackexchange.com/a/478957/28500) for some details. – EdM Jun 20 '21 at 19:56
  • 1
    Thanks for updating the information on the reference. Please report the version of `survival` that _you_ used, e.g. reported by the `sessionInfo()` function. You seem to be using a more recent version than the book, as your `cox.zph()` doesn't report a "rho" value like the older versions did. – EdM Jun 20 '21 at 20:26
  • Survival package version used is survival_3.1-12. This has been updated in the question. – John Majimboni Jun 20 '21 at 20:30
  • @EdM Do you have a reference for formulation procedure of the actual score test? – John Majimboni Jun 21 '21 at 10:55
  • I'd recommend looking at the most recent `cox.zph` code. An answer below has a link to an early version of the prior code. – EdM Jun 21 '21 at 12:09

1 Answers1

3

This isn't going to work, for three reasons

  1. That's not the test that cox.zph does. It hasn't been for years (as the comments point out). It now does the actual score test.

  2. You're looking at the correlation between the Schoenfeld residuals and time; the default in cox.zph is to use the Kaplan-Meier estimator instead of just time.

  3. You can't get around the issue of ties in this setting just by using ties="breslow" because cox.zph always uses ties="efron".

If you fix the second issue by comparing to cox.zph(res.cox, transform="identity") it will be a lot closer.

Alternatively, the old code that does the test you are doing is here

Thomas Lumley
  • 21,784
  • 1
  • 22
  • 73