I am working on a survival analysis with time-dependent covariates, using the `coxph` function of the `survival` package in R.
The analysis investigates the effect of an intervention on hospital admissions, adjusting for known confounders that change over time by including them as time-dependent covariates in the time-dependent form of the `coxph` function. In the example below, `EmergencyAdmission` is a binary indicator of whether an admission occurred in each time interval, and the covariates `CareHomeStay` and `EDAttendance` are binary indicators of whether a patient had a care home stay or an ED attendance in each time interval. These two covariates are known confounders through their prior history.
results <- coxph(formula = Surv(tstart, tstop, EmergencyAdmission) ~ InterventionIndicator + CareHomeStay + EDAttendance, data = mydata)
My question is about how `coxph` processes the covariates: does the hazard at time t depend on past AND current values of the covariate, or only on past values?
The discussion on this similar question seems to indicate that it does process the full history of covariates, including the current values at time t. However, I am seeking further clarity so that I can decide whether it is appropriate to include these confounders in this way.
For example, prior history of ED Attendance is the confounder I want to include, not the current value, because ED Attendance may be related to the outcome (e.g. patient has an Emergency Admission and as a result has a subsequent ED Attendance in the same time period). Hence if the coxph function is processing ED Attendance as a covariate at the current time, this would be violating the rule highlighted in Therneau's vignette.
> The key rule for time dependent covariates in a Cox model is simple and essentially the same as that for gambling: you cannot look into the future. A covariate may change in any way based on past data or outcomes, but it may not reach forward in time.
In addition, the problem is exacerbated when I try to include prior history of Emergency Admission as a covariate, because `EmergencyAdmission` is the outcome. If `coxph` uses the current values of the covariates in the hazard at time t, then it must be wrong to include `EmergencyAdmission` as the outcome variable AND as a covariate. However, if it only processes time-dependent covariates UP TO time t, would including it as a covariate be correct (because only the prior history of the covariate is being processed)?
So I have two main questions:
1. In determining the hazard at time t, does the `coxph` function process values of time-dependent covariates up to time t OR up to and including time t?
2. Based on the answer to 1), how should I incorporate these time-dependent covariates? (If the `coxph` function processes current values in addition to past values, my thoughts are to include these covariates with a lag (e.g. add a `PriorEmergencyAdmission` variable that indicates if an admission took place in the previous time period). Would this be the right approach?)
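To make the lag idea concrete, here is a minimal sketch of what I have in mind, using only base R and the example rows from the table in this question. Apart from `PriorEmergencyAdmission`, the names (`lag_one`, `PriorCareHomeStay`, `PriorEDAttendance`, `AnyPriorEDAttendance`) are placeholders I made up for illustration, and filling the first interval with 0 assumes there is no history before follow-up starts. I am not claiming this is the correct specification; it is only meant to show the construction I am asking about.

```r
# Example rows copied from the table in this question
mydata <- data.frame(
  PatientID             = rep(1:2, each = 4),
  tstart                = rep(c(0, 31, 61, 91), times = 2),
  tstop                 = rep(c(30, 60, 90, 120), times = 2),
  InterventionIndicator = c(0, 0, 1, 1, 0, 0, 1, 0),
  EmergencyAdmission    = c(1, 0, 0, 1, 0, 0, 1, 0),
  CareHomeStay          = c(0, 1, 0, 0, 0, 1, 0, 0),
  EDAttendance          = c(1, 0, 1, 1, 1, 0, 1, 0)
)

# Hypothetical helper: shift a vector down by one interval.
# The first interval has no earlier row, so it gets `fill` (assumed 0 here).
lag_one <- function(x, fill = 0) c(fill, head(x, -1))

# Rows must be in time order within each patient before lagging
mydata <- mydata[order(mydata$PatientID, mydata$tstart), ]

# Value of each indicator in the previous interval, computed per patient
mydata$PriorEmergencyAdmission <- ave(mydata$EmergencyAdmission, mydata$PatientID, FUN = lag_one)
mydata$PriorCareHomeStay       <- ave(mydata$CareHomeStay,       mydata$PatientID, FUN = lag_one)
mydata$PriorEDAttendance       <- ave(mydata$EDAttendance,       mydata$PatientID, FUN = lag_one)

# Alternative: "any ED attendance in any earlier interval" rather than only the previous one
mydata$AnyPriorEDAttendance <- ave(mydata$EDAttendance, mydata$PatientID,
                                   FUN = function(x) cummax(lag_one(x)))

# The model would then use the lagged covariates (not run: 8 rows is far too few to fit)
# library(survival)
# results <- coxph(Surv(tstart, tstop, EmergencyAdmission) ~ InterventionIndicator +
#                    PriorEmergencyAdmission + PriorCareHomeStay + PriorEDAttendance,
#                  data = mydata)
```

With this construction, every covariate value in a row depends only on intervals that ended before that row's `tstart`, which is the property I want to guarantee.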
Any help appreciated!
If of any help, the time series data looks like this:
PatientID | tstart | tstop | InterventionIndicator | EmergencyAdmission | CareHomeStay | EDAttendance |
---|---|---|---|---|---|---|
1 | 0 | 30 | 0 | 1 | 0 | 1 |
1 | 31 | 60 | 0 | 0 | 1 | 0 |
1 | 61 | 90 | 1 | 0 | 0 | 1 |
1 | 91 | 120 | 1 | 1 | 0 | 1 |
2 | 0 | 30 | 0 | 0 | 0 | 1 |
2 | 31 | 60 | 0 | 0 | 1 | 0 |
2 | 61 | 90 | 1 | 1 | 0 | 1 |
2 | 91 | 120 | 0 | 0 | 0 | 0 |