
The cox.zph() function in the R package survival gives a significant p-value for violation of the proportional hazards assumption for a covariate included in a Cox model. However, a plot of the scaled Schoenfeld residuals against time (below) shows no obvious correlation. For the sake of clarity, the covariate is categorized (there are three categories).

Could the significant p-value be due to the large sample size (500,000) and number of events (~3,000), which allow detection of a small, unimportant correlation? If so, is it acceptable to make a judgement based on the plot rather than the cox.zph() output and include the covariate in the model as is (i.e. without stratifying or adding an interaction with time)?

Scaled Schoenfeld residuals against time

John
  • With this many cases, showing the individual residuals makes it hard to see the overall pattern. Please re-plot without the individual residuals, and emphasize the line showing the smoothed estimate of Beta(t). Also, change the y-axis limits to show more clearly the actual range over which Beta(t) seems to be changing. Right now, the y-axis goes from -10 to +10, representing hazard ratios between $5 \times 10^{-5}$ and 22000! Center the y-axis at the overall estimate for the coefficient, and show maybe +/- 1 unit beyond in each direction. – EdM Apr 06 '21 at 13:33
  • Also, if this is a continuous covariate, you might be better off fitting with a flexible continuous fit (e.g., restricted cubic splines) first. Mis-specification of the functional form of a covariate can show up [like a PH violation](https://stats.stackexchange.com/q/379416/28500). If you want to simplify your presentation of the results, you can illustrate thereafter with examples having different particular values. – EdM Apr 06 '21 at 13:39
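
The two suggestions in the comments above could be tried along the following lines. This is only a minimal sketch, not the asker's analysis: it uses the `lung` data set that ships with the survival package as a stand-in, and the covariates `age` and `sex`, the choice of `df = 3` for the spline, and the object names are all assumptions made for illustration. `splines::ns()` fits a natural cubic spline, used here as one way of getting the kind of flexible continuous fit the comment mentions.

```r
library(survival)
library(splines)

# Stand-in data and model (assumption: the real model and covariate differ).
fit <- coxph(Surv(time, status) ~ age + sex, data = lung)

# Proportional hazards test; with a very large sample, small departures
# can give small p-values, so the plot below helps judge their size.
zph <- cox.zph(fit)
print(zph)

# Re-plot the smoothed Beta(t) for the first term (age) without the
# individual residuals, with the y-axis centered on the overall
# coefficient estimate and extending 1 unit in each direction.
b <- coef(fit)["age"]
plot(zph, var = 1, resid = FALSE, ylim = b + c(-1, 1))
abline(h = b, lty = 3)  # reference line at the time-constant estimate

# Flexible functional form for the continuous covariate, then re-check PH.
fit_ns <- coxph(Surv(time, status) ~ ns(age, df = 3) + sex, data = lung)
zph_ns <- cox.zph(fit_ns)
print(zph_ns)
```

If the smoothed curve stays close to the dotted reference line over the whole follow-up period on this narrower scale, that would support reading the significant test as a consequence of sample size rather than of a practically important violation.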
