The cox.zph() function in the R package survival reports a significant p-value for violation of the proportional hazards assumption for a covariate included in a Cox model. However, a plot of the scaled Schoenfeld residuals against time (below) shows no obvious correlation. To be clear, the covariate is categorized (there are three categories).
Could the significant p-value be due to the large sample size (500,000) and number of events (~3,000), which allow detection of a small, unimportant correlation? If so, is it acceptable to make a judgement based on the plot rather than the cox.zph() output and keep the covariate in the model as it is (i.e. without stratifying or adding an interaction with time)?
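
For reference, here is a minimal sketch of the workflow described above: fit the Cox model, run cox.zph(), and plot the scaled Schoenfeld residuals. It runs on simulated data, not the real data set. The data frame dat, the covariate x with its three levels, the sample size, and the censoring time are all hypothetical placeholders chosen for illustration.

```r
library(survival)

set.seed(1)

# Hypothetical simulated data: a three-level covariate and exponential
# event times with administrative censoring at t = 5.
n <- 20000
x <- factor(sample(c("a", "b", "c"), n, replace = TRUE))
rate <- 0.1 * exp(0.3 * (x == "b") + 0.5 * (x == "c"))
event_time <- rexp(n, rate = rate)
dat <- data.frame(
  time   = pmin(event_time, 5),
  status = as.integer(event_time < 5),
  x      = x
)

# Cox model with the categorical covariate entered as is
fit <- coxph(Surv(time, status) ~ x, data = dat)

# Test of proportional hazards based on scaled Schoenfeld residuals
zph <- cox.zph(fit)
print(zph)

# Smoothed residuals against (transformed) time; a roughly flat curve
# suggests the coefficient does not change much over follow-up.
plot(zph)
```

Reading the printed test alongside the plot makes it easier to compare the p-value with the size of any visible trend in the coefficient over time.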