Does the R coxph model process past AND current values for time-varying covariates?

Question

I am working on a survival analysis with time-dependent covariates, using the coxph function of the survival package in R.

The analysis is investigating the effect of an intervention on hospital admissions, adjusting for known confounders that change over time, through the inclusion of time-dependent covariates in the time-dependent version of the coxph function. In the example below EmergencyAdmission is a binary indicator of whether an admission has occurred at each time interval, and the covariates CareHomeStay and EDAttendance are binary indicators of whether a patient had a stay in a care home or ED attendance at each time interval. These two covariates are known confounders, in terms of their prior history.

results <- coxph(formula = Surv(tstart, tstop, EmergencyAdmission) ~ InterventionIndicator + CareHomeStay + EDAttendance, data = mydata)

My question is about how coxph processes the covariates: does the hazard at time t depend on past AND current values of the covariate, or only on past values? The discussion on this similar question seems to indicate that it processes the full history of the covariates, including their current values at time t. However, I am seeking further clarity in order to decide whether it is appropriate to include these confounders in this way.

For example, prior history of ED Attendance is the confounder I want to include, not the current value, because current ED Attendance may be a consequence of the outcome (e.g. a patient has an Emergency Admission and, as a result, has a subsequent ED Attendance in the same time period). Hence, if the coxph function uses ED Attendance at the current time as a covariate, this would violate the rule highlighted in Therneau's vignette:

The key rule for time dependent covariates in a Cox model is simple and essentially the same as that for gambling: you cannot look into the future. A covariate may change in any way based on past data or outcomes, but it may not reach forward in time

In addition, the problem is exacerbated when I look to include prior history of Emergency Admission as a covariate, because EmergencyAdmission is the outcome. If coxph uses current values of the covariates in the hazard at time t, then it must be wrong to include EmergencyAdmission as the outcome variable AND as a covariate. However, if it only uses time-dependent covariates UP TO time t, then would including it as a covariate be correct (because only the prior history of the covariate is used)?

So I have two main questions:

1) In determining the hazard at time t, does the coxph function process values of time-dependent covariates up to time t OR up to and including time t?
2) Based on the answer to 1), how should I incorporate these time-dependent covariates? If the coxph function processes current values in addition to past values, my thought is to include these covariates with a lag (e.g. add a PriorEmergencyAdmission variable that indicates whether an admission took place in the previous time period). Would this be the right approach?

Any help appreciated!

If of any help, the time series data looks like this:

PatientID	tstart	tstop	InterventionIndicator	EmergencyAdmission	CareHomeStay	EDAttendance
1	0	30	0	1	0	1
1	31	60	0	0	1	0
1	61	90	1	0	0	1
1	91	120	1	1	0	1
2	0	30	0	0	0	1
2	31	60	0	0	1	0
2	61	90	1	1	0	1
2	91	120	0	0	0	0

score 1 · Accepted Answer · answered Jan 05 '21 at 22:31

In determining the hazard at time t, does the coxph function process values of time-dependent covariates up to time t OR up to and including time t?

Cox models use the instantaneous values of the covariates at time t to estimate relative hazards at time t. It's best to think of those as the covariate values at a time infinitesimally before the event time t, to get around the looking-into-the-future problem you rightly note. Earlier values of time-varying covariates might affect the probability of surviving up to time t, but by definition the hazard represents the instantaneous risk of an event given that you have survived that long already.
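The same point can be written compactly in standard Cox-model notation. The symbols are assumptions introduced here, not taken from the question: h_0(t) is the baseline hazard, Z_i(t) is the covariate vector of subject i at time t, and β is the coefficient vector.

```latex
h_i(t) = h_0(t)\,\exp\!\bigl(\beta^{\top} Z_i(t)\bigr)
```

In (tstart, tstop] data, Z_i(t) is whatever is recorded on the row whose interval contains t. Past values influence the hazard only if they have been coded into that row.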

how should I incorporate these time-dependent covariates?

That completely depends on your understanding of the subject matter. You are free to incorporate prior covariate values into an instantaneous covariate value, to be considered at time t, in any way that makes sense. You must avoid survivorship bias. You also can't include covariate values that effectively happen after an event; as you note, you can't use the fact of an Emergency Admission as a covariate to predict that admission. You might, however, use the number of prior Emergency Admissions as a covariate for predicting subsequent Admissions. Whether that might make sense depends on an understanding of the subject matter.
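One possible way to code "prior history" is sketched below in base R, using the example rows from the question. The helper lag1 and the new column names (PriorEDAttendance, PriorCareHomeStay, PriorEmergencyAdmission, NPriorAdmissions) are hypothetical, and setting the first interval to 0 assumes that no history is known before follow-up starts.

```r
# Example rows copied from the table in the question.
mydata <- data.frame(
  PatientID             = rep(1:2, each = 4),
  tstart                = rep(c(0, 31, 61, 91), times = 2),
  tstop                 = rep(c(30, 60, 90, 120), times = 2),
  InterventionIndicator = c(0, 0, 1, 1, 0, 0, 1, 0),
  EmergencyAdmission    = c(1, 0, 0, 1, 0, 0, 1, 0),
  CareHomeStay          = c(0, 1, 0, 0, 0, 1, 0, 0),
  EDAttendance          = c(1, 0, 1, 1, 1, 0, 1, 0)
)

# Rows must be in time order within each patient before lagging.
mydata <- mydata[order(mydata$PatientID, mydata$tstart), ]

# Hypothetical helper: shift values forward by one interval.
# The first interval gets 0 (assumption: no known history at baseline).
lag1 <- function(x) c(0, head(x, -1))

# Value of each covariate in the previous interval, computed per patient.
mydata$PriorEDAttendance       <- ave(mydata$EDAttendance, mydata$PatientID, FUN = lag1)
mydata$PriorCareHomeStay       <- ave(mydata$CareHomeStay, mydata$PatientID, FUN = lag1)
mydata$PriorEmergencyAdmission <- ave(mydata$EmergencyAdmission, mydata$PatientID, FUN = lag1)

# Number of admissions in all earlier intervals (excludes the current one).
mydata$NPriorAdmissions <- ave(mydata$EmergencyAdmission, mydata$PatientID,
                               FUN = function(x) cumsum(x) - x)

print(mydata[, c("PatientID", "tstart", "PriorEDAttendance",
                 "PriorEmergencyAdmission", "NPriorAdmissions")])
```

Whether a one-interval lag, a cumulative count, or some other summary of the history is appropriate is a subject-matter decision. The code only shows that each row ends up carrying information from strictly earlier intervals.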

Also important, your situation involves repeated events, so you have to make sure to handle those appropriately.
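A minimal sketch of one common way to acknowledge repeated events, a robust (sandwich) variance clustered on patient, is shown below. The data are simulated purely for illustration, so the estimates mean nothing. The column names PriorEDAttendance and NPriorAdmissions and the helper lag1 are hypothetical. The intervals are coded contiguously (0 to 30, 30 to 60, ...), which is an assumption and differs slightly from the table in the question.

```r
library(survival)

set.seed(1)
n_pat <- 200  # simulated patients (illustrative only)
n_int <- 4    # 30-day intervals per patient

sim <- data.frame(
  PatientID = rep(seq_len(n_pat), each = n_int),
  tstart    = rep(c(0, 30, 60, 90), times = n_pat),
  tstop     = rep(c(30, 60, 90, 120), times = n_pat)
)
n <- nrow(sim)
sim$InterventionIndicator <- rbinom(n, 1, 0.5)
sim$EDAttendance          <- rbinom(n, 1, 0.3)
sim$EmergencyAdmission    <- rbinom(n, 1, 0.2)

# Hypothetical helper: previous interval's value, 0 for the first interval.
lag1 <- function(x) c(0, head(x, -1))
sim$PriorEDAttendance <- ave(sim$EDAttendance, sim$PatientID, FUN = lag1)
sim$NPriorAdmissions  <- ave(sim$EmergencyAdmission, sim$PatientID,
                             FUN = function(x) cumsum(x) - x)

# Only lagged or cumulative history enters as covariates;
# cluster(PatientID) requests robust standard errors grouped by patient.
fit <- coxph(
  Surv(tstart, tstop, EmergencyAdmission) ~ InterventionIndicator +
    PriorEDAttendance + NPriorAdmissions + cluster(PatientID),
  data = sim
)
print(summary(fit))
```

This is only one of several ways to handle recurrent events. Whether a robust variance alone is enough, or whether the time scale or stratification also needs to change, depends on the application.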

Thank you for this helpful answer. I have proceeded to use the prior values rather than the current values of the confounding covariates. And thank you for the note on repeated events: yes, my model includes a cluster() term to account for the repeated observations. — xtna, Jan 08 '21 at 16:15
