I am looking for tips on adapting standard survival analysis software to the analysis of recurrent events, within the framework of Cox regression. Specifically: Is this always possible? What are the limitations? Can it be done for a birth-and-death process?

If you can recommend software in which the analysis of recurrent events is already implemented, that would be helpful as well. (At the moment I am using the pysurvival package, which does not provide any special capability for recurrent-event analysis.)

Roger Vadim

1 Answer

The classic book by Terry Therneau and Patricia Grambsch, "Modeling Survival Data: Extending the Cox Model," devotes chapter 8 to modeling multiple events per subject. It covers recurrent events of the same type, ordered and unordered events, competing events, and multi-state models. If you can't get a copy, much of that material is also in a Mayo Clinic Technical Report by Therneau.

Briefly, there are two major issues that need to be addressed. First:

"A major issue in extending proportional hazards regression models to this situation is intrasubject correlation." (Therneau and Grambsch, page 169)

One accepted way to deal with this is to fit the standard model but use robust (sandwich) estimates of the coefficient (co)variances, with observations grouped by subject.

Second is the type of recurrence pattern you expect. For example, if there is a single event type for which the time clock resets to 0 at each event (a renewal process), then you might use the simple (time,status) data format,* where time is recorded as the time since study entry for an individual's first event and as the time since the prior event for the second and later events. You then do a standard Cox regression, with an id variable keeping track of which rows belong to which individual so that you can construct the robust error estimates.
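As a rough illustration of that gap-time layout, here is a minimal sketch in plain Python. The function name `to_gap_time_rows`, the column names, and the toy data are all made up for illustration; they are not part of any package.

```python
def to_gap_time_rows(subject_id, event_times, follow_up_end):
    """Convert one subject's event times (measured from study entry)
    into gap-time rows: each row holds the time since the previous event."""
    rows = []
    previous = 0.0  # the clock starts at study entry
    for t in sorted(event_times):
        rows.append({"id": subject_id, "time": t - previous, "status": 1})
        previous = t  # the clock resets at each event
    # Censored interval after the last event, if follow-up continued
    if follow_up_end > previous:
        rows.append({"id": subject_id, "time": follow_up_end - previous, "status": 0})
    return rows


# Hypothetical data: subject A has events at t=5 and t=12, followed to t=20;
# subject B has no events and is censored at t=15.
gap_rows = to_gap_time_rows("A", [5.0, 12.0], 20.0) + to_gap_time_rows("B", [], 15.0)
for row in gap_rows:
    print(row)
```

Rows like these can then go into an ordinary Cox fit, with the `id` column supplied as the grouping variable for the robust variance (in lifelines, for example, through the `cluster_col` argument of `CoxPHFitter.fit`).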

For more complicated situations you will probably need to use a counting-process data format in which you specify (timeStart,timeStop,status) for the period leading up to each event or censoring time. That format (also used for time-dependent covariate values) is available in the Python lifelines package along with a cluster variable option to handle robust variance estimates. So I suspect that could work for recurrent events of a single type.
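To make the counting-process layout concrete, here is a minimal sketch in plain Python that keeps time on the study-entry clock and emits one (timeStart,timeStop,status) row per at-risk interval. The function name `to_counting_process_rows` and the toy data are hypothetical.

```python
def to_counting_process_rows(subject_id, event_times, follow_up_end):
    """Convert one subject's event times (measured from study entry)
    into counting-process rows (timeStart, timeStop, status)."""
    rows = []
    start = 0.0
    for t in sorted(event_times):
        # Interval (start, t] ends with an event
        rows.append({"id": subject_id, "timeStart": start, "timeStop": t, "status": 1})
        start = t  # the next at-risk interval begins where this one ended
    # Final interval ends in censoring, if follow-up continued past the last event
    if follow_up_end > start:
        rows.append({"id": subject_id, "timeStart": start, "timeStop": follow_up_end, "status": 0})
    return rows


# Hypothetical data: subject A has events at t=5 and t=12, followed to t=20;
# subject B has no events and is censored at t=15.
cp_rows = to_counting_process_rows("A", [5.0, 12.0], 20.0) + to_counting_process_rows("B", [], 15.0)
for row in cp_rows:
    print(row)
```

Unlike the gap-time layout, the clock here never resets: each subject's rows tile the follow-up period without overlap, and the `id` column again identifies the cluster for robust standard errors.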

For competing-event or multi-state models, the status in the (timeStart,timeStop,status) format needs to be a multi-category factor rather than a simple binary event/censored indicator, and models need to be specified for each allowed transition between states. I don't think that lifelines handles such models, but they are handled in the R survival package. A vignette in the package outlines how to implement such models.
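One way to picture "a model for each allowed transition" is to split multi-state rows into one binary-event data set per transition. The sketch below does that in plain Python; the `state` column, the `"censor"` label, the function name `split_by_transition`, and the illness-death toy data are assumptions made for illustration, not the layout required by any particular package.

```python
def split_by_transition(rows, allowed_transitions):
    """For each allowed (from_state, to_state) pair, keep the rows in which the
    subject occupies from_state, and code event=1 only if the interval ended
    with a move to to_state (any other ending counts as censored)."""
    datasets = {}
    for from_state, to_state in allowed_transitions:
        datasets[(from_state, to_state)] = [
            {
                "id": r["id"],
                "timeStart": r["timeStart"],
                "timeStop": r["timeStop"],
                "event": int(r["status"] == to_state),
            }
            for r in rows
            if r["state"] == from_state
        ]
    return datasets


# Hypothetical illness-death data: "status" is the state entered at timeStop,
# or "censor" if follow-up ended without a transition.
rows = [
    {"id": "A", "timeStart": 0, "timeStop": 4, "state": "healthy", "status": "ill"},
    {"id": "A", "timeStart": 4, "timeStop": 9, "state": "ill", "status": "dead"},
    {"id": "B", "timeStart": 0, "timeStop": 7, "state": "healthy", "status": "censor"},
]
allowed = [("healthy", "ill"), ("healthy", "dead"), ("ill", "dead")]

by_transition = split_by_transition(rows, allowed)
for transition, data in by_transition.items():
    print(transition, data)
```

Each resulting data set has a binary event indicator, so in principle it can be fitted with software that only handles the counting-process format. For a birth-and-death process the states would be counts, and the allowed transitions would be n to n+1 and n to n-1.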


*This specification of a single time (since time=0) and a binary event/censoring status seems, at first glance, to be the only format allowed by pysurvival.

EdM