I'm fitting a Cox PH model with the lifelines package in Python. I evaluated the model on a training set and a holdout set.

These are the scores it gave:

cph.score(holdout_x, scoring_method='concordance_index') = 0.6892575904820802

cph.score(holdout_x, scoring_method='log_likelihood') = -5.160975066321637

cph.score(train_x, scoring_method='concordance_index') = 0.6948730045908684

cph.score(train_x, scoring_method='log_likelihood') = -5.719069733545282

Is there a good rule of thumb (if any) for interpreting the partial log-likelihood? For comparison, here it says that a "dumb" log-loss for the binary case would range between 0.1 and 0.8, depending on the prevalence. My values are very different, but I am aware that the partial log-likelihood is also calculated quite differently.

amestrian
  • I am not very familiar with Cox PH analysis, but in general the log-likelihood will get more and more negative as the sample size increases. Also, it is usually difficult to interpret the log-likelihood by itself, and it's mostly used as an ingredient for model comparison (see for example LRT, BIC or AIC) – PedroSebe Nov 12 '20 at 03:25
  • That makes sense, both things. So in this case maybe the concordance index is a bit better at "measuring" how good the model is... – amestrian Nov 12 '20 at 15:36

1 Answer

For modeling a given data set, partial log-likelihood is useful for things like evaluating hyper-parameter choices in cross-validation. As a comment on this question notes, however, the actual value of partial log-likelihood for a Cox model depends on sample size, so there is no single "reasonable value."
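To make the sample-size dependence concrete, here is a minimal sketch in plain Python. It is not lifelines' implementation: the function names are hypothetical, and it assumes no tied event times and no censoring in the toy data. It evaluates the partial log-likelihood of a "null" model (all risk scores equal) on data sets of two sizes. As far as I know, lifelines' `score` reports an average per observation rather than a total, so the sketch prints both.

```python
import math

def cox_partial_log_likelihood(times, events, risk_scores):
    """Cox partial log-likelihood for given linear predictors.

    Illustrative only: assumes there are no tied event times.
    """
    order = sorted(range(len(times)), key=lambda i: times[i])
    total = 0.0
    for pos, i in enumerate(order):
        if not events[i]:
            continue  # censored subjects contribute only through risk sets
        # Risk set: everyone whose observed time is >= this event time.
        at_risk = order[pos:]
        log_denominator = math.log(sum(math.exp(risk_scores[j]) for j in at_risk))
        total += risk_scores[i] - log_denominator
    return total

def null_partial_log_likelihood(n):
    """Null model (all risk scores zero) on n subjects with distinct event times."""
    times = list(range(1, n + 1))
    events = [True] * n
    return cox_partial_log_likelihood(times, events, [0.0] * n)

small = null_partial_log_likelihood(5)
large = null_partial_log_likelihood(50)
print("total:  ", small, large)
print("average:", small / 5, large / 50)
```

Even for this uninformative model, both the total and the per-subject average become more negative as the number of subjects grows, because the risk sets get larger. That is why a value like -5.16 or -5.72 only means something relative to another model scored on the same data.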

What constitutes a "reasonable value" of a concordance index depends on the particular field of application. The concordance is the proportion of comparable case pairs for which the predicted order of events agrees with the observed order: 0.5 corresponds to random ordering and 1 to perfect ordering. Beyond that, you need to use your knowledge of the subject matter to decide what's "good."
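The pairwise definition of concordance can be sketched in a few lines of plain Python. This is a simplified illustration rather than the lifelines routine: the function name and the toy data are made up, pairs with tied times are simply skipped, and a pair counts as comparable only when the earlier time is an observed event.

```python
from itertools import combinations

def pairwise_concordance(times, events, predicted_risk):
    """Fraction of comparable pairs ordered correctly by predicted risk.

    A higher predicted risk is expected to go with an earlier event.
    Ties in predicted risk count as half-concordant.
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # simplification: ignore pairs with tied times
        first, second = (i, j) if times[i] < times[j] else (j, i)
        if not events[first]:
            continue  # earlier subject was censored, so the true order is unknown
        comparable += 1
        if predicted_risk[first] > predicted_risk[second]:
            concordant += 1.0
        elif predicted_risk[first] == predicted_risk[second]:
            concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable

# Hypothetical toy data: four subjects, the third one censored.
times = [2, 4, 6, 8]
events = [True, True, False, True]
predicted_risk = [0.9, 0.3, 0.5, 0.1]

c_index = pairwise_concordance(times, events, predicted_risk)
print(c_index)  # 4 of the 5 comparable pairs are ordered correctly -> 0.8
```

Note that the score depends only on the ranking of the predictions, not on how close the predicted survival probabilities are to the truth.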

It's even more important to gauge how well your model might be expected to apply to new samples from your population of interest. This page and its links provide an introduction to such approaches. They let you go beyond the concordance, which only measures how well the model orders event times, to more specific measures of model quality, such as the precision of estimated probabilities of survival to specified times of interest.

EdM