
The 2nd edition of James et al., "An Introduction to Statistical Learning" (2021), contains a new chapter on survival analysis and censored data (Chapter 11). Section 11.6 discusses shrinkage for the Cox model. When selecting the optimal shrinkage intensity $\lambda$ in regression or classification tasks, one often uses cross-validation. To that end, one needs to measure the model's performance on a left-out fold. The text says this is nontrivial for the Cox model because some observations are censored. Some guidelines are provided as to how model fit could be assessed nonnumerically, but I do not see how exactly that would help in selecting the optimal regularization intensity in a regularized Cox model. Hence my questions:

  1. How to assess numerically out-of-sample performance of a (regularized) Cox model?
  2. How to select optimal regularization intensity in a regularized Cox model?

(Answering 1. helps answer 2., but perhaps 2. can be answered in a different way altogether.)

Richard Hardy

1 Answer


For Question 2, minimizing the cross-validated partial-likelihood deviance seems to be the best way to choose the penalty. That is implemented for Cox models in the R glmnet package and illustrated in Figure 11.7 of ISLR2. I think the concordance index is also available as a cross-validation measure in that package, but concordance is not a good choice for comparing models against each other, which is what you are doing when choosing a penalty.
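
glmnet handles all of this internally, but to make the quantity concrete, here is a minimal pure-Python sketch of the held-out partial log-likelihood (Breslow form) from which the cross-validated deviance is built; the deviance is $-2$ times this (glmnet itself uses, I believe, the Verweij–van Houwelingen variant, which differences full-data and training-fold likelihoods). The function name and setup are my own illustration, not glmnet's code:

```python
import math

def cox_partial_loglik(times, events, X, beta):
    """Breslow partial log-likelihood of a (held-out) fold.

    times: event/censoring times; events: 1 = event, 0 = censored;
    X: rows of covariates; beta: coefficients already fitted on the
    training folds (e.g. by a lasso/ridge Cox fit at some lambda).
    """
    # linear predictors eta_i = x_i' beta
    eta = [sum(b * x for b, x in zip(beta, row)) for row in X]
    ll = 0.0
    for i, (t_i, d_i) in enumerate(zip(times, events)):
        if d_i:  # only uncensored observations contribute terms
            # risk set: subjects still under observation at t_i
            denom = sum(math.exp(eta[j])
                        for j, t_j in enumerate(times) if t_j >= t_i)
            ll += eta[i] - math.log(denom)
    return ll
```

Evaluating this on each left-out fold, for each candidate $\lambda$, and picking the $\lambda$ with the smallest average deviance is exactly what `cv.glmnet(..., family = "cox", type.measure = "deviance")` reports.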

For Question 1, see the approach of Frank Harrell's calibrate() function in his rms package. First, choose a time point of interest. For the out-of-sample cases, find the estimated survival probabilities at that time from the model.* Compare those probabilities against corresponding "observed" survival probabilities for those cases, based on interpolations that take nonlinearities, interactions, and censoring into account. That "Hare" interpolation method is outlined here, with links to further references.
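
I can't reproduce the hare interpolation in a few lines, but the idea of comparing predicted against "observed" survival can be sketched with a much cruder stand-in: bin the held-out cases by their predicted $S(t)$ and compare each bin's mean prediction with a Kaplan–Meier estimate in that bin (KM handles censoring, though unlike hare it ignores nonlinearities and interactions). All names below are my own illustration, not the rms implementation:

```python
def km_survival_at(times, events, t):
    """Kaplan-Meier estimate of S(t) from (possibly censored) data."""
    s = 1.0
    for tj in sorted({ti for ti, d in zip(times, events) if d and ti <= t}):
        d_j = sum(1 for ti, d in zip(times, events) if d and ti == tj)  # events at tj
        n_j = sum(1 for ti in times if ti >= tj)                        # at risk at tj
        s *= 1.0 - d_j / n_j
    return s

def calibration_groups(pred_surv, times, events, t, n_groups=3):
    """Bin held-out cases by predicted S(t); in each bin, pair the mean
    predicted probability with the KM-'observed' probability."""
    order = sorted(range(len(pred_surv)), key=lambda i: pred_surv[i])
    size = len(order) // n_groups
    pairs = []
    for g in range(n_groups):
        idx = order[g * size:(g + 1) * size] if g < n_groups - 1 else order[g * size:]
        mean_pred = sum(pred_surv[i] for i in idx) / len(idx)
        observed = km_survival_at([times[i] for i in idx],
                                  [events[i] for i in idx], t)
        pairs.append((mean_pred, observed))
    return pairs
```

A well-calibrated model gives pairs lying near the 45-degree line; `calibrate()` does this far more carefully, with resampling.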


*How to find estimated survival probabilities seems to be glossed over in ISLR2. See this page for how the baseline survival function can be estimated from the (risk-adjusted) cumulative hazard function from the data. Then you just adjust that baseline survival function for any new set of covariates.
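
That recipe can be sketched in a few lines: estimate the (Breslow) baseline cumulative hazard $H_0(t)$ from the training data, convert it to a baseline survival function via $S_0(t) = \exp(-H_0(t))$, then adjust for new covariates by $S(t \mid x) = S_0(t)^{\exp(x'\beta)}$. Function names are my own; a minimal sketch:

```python
import math

def breslow_cumhaz(times, events, eta):
    """Breslow estimator of the baseline cumulative hazard H0(t),
    given linear predictors eta = X @ beta from the fitted Cox model."""
    def H0(t):
        h = 0.0
        for i, (ti, d) in enumerate(zip(times, events)):
            if d and ti <= t:  # each event time adds one increment
                h += 1.0 / sum(math.exp(eta[j])
                               for j, tj in enumerate(times) if tj >= ti)
        return h
    return H0

def survival_prob(H0, t, x_new, beta):
    """S(t | x) = S0(t) ** exp(x'beta), with S0(t) = exp(-H0(t))."""
    eta_new = sum(b * xi for b, xi in zip(beta, x_new))
    return math.exp(-H0(t) * math.exp(eta_new))
```

In R, `survival::basehaz()` (or `survfit()` on a `coxph` fit) gives the same quantities without hand-rolling.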

EdM