How to generate survival data with time dependent covariates using R

Question

I want to generate survival time from a Cox proportional hazards model that contains time dependent covariate. The model is

$h(t|X_i) =h_0(t) \exp(\gamma X_i + \alpha m_{i}(t))$

where $X_i$ is generated from Binomial(1,0.5) and $m_{i}(t)=\beta_0 + \beta_1 X_{i} + \beta_2 X_{i} t$.

The true parameter values are used as $\gamma = 1.5, \beta_0 = 0, \beta_1 = -1, \beta_2 = -1.5, h_0(t) = 1$

For time-independent covariate (i.e. $h(t|X_i) =h_0(t) \exp(\gamma X_i) $ I generated as follows

#For time independent case
# h_0(t) = 1
gamma <- -1
u <- runif(n=100,min=0,max=1)
Xi <- rbinom(n=100,size=1,prob=0.5)
T <- -log(u)/exp(gamma*Xi)

Can anyone please help me to generate survival data with time-varying covariate.

What sort of function is $m_i(t)$? Is it continuous? Piecewise constant? A different algorithm will probably be needed accordingly. — tristan, Jul 24 '14 at 20:42
$m_{i}(t)$ is a time dependent covariate, for simplicity you can consider a proportional relationship with time. — Sheikh, Jul 25 '14 at 13:42
how did you perform the R code from the above equation? means that At each death time within the same id the program needs to figure out what the covariates are for everyone which is either x is equal to 1 or 0. if all equal to 1 cumsum the hazard. after that calculate the survival function. lets it pick the right line for each subject. — Qas Amell, Jun 14 '15 at 04:47
As Z. Zhang points out then have a look at [this article](https://www.ncbi.nlm.nih.gov/pubmed/22763916). Further, you can see [my answer](https://stats.stackexchange.com/a/313343/81865) to his question where I show how to simulate for those in the $X_i = 1$ group in R. — Benjamin Christoffersen, Nov 12 '17 at 20:24

score 10 · Accepted Answer · answered Jul 29 '14 at 10:45

OK from your R code you are assuming an exponential distribution (constant hazard) for your baseline hazard. Your hazard functions are therefore:

$$ h\left(t \mid X_i\right) = \begin{cases} \exp{\left(\alpha \beta_0\right)} & \text{if $X_i = 0$,} \\ \exp{\left(\gamma + \alpha\left(\beta_0+\beta_1+\beta_2 t\right)\right)} & \text{if $X_i = 1$.} \end{cases} $$

We then integrate these with respect to $t$ to get the cumulative hazard function:

$$ \begin{align} \Lambda\left(t\mid X_i\right) &= \begin{cases} t \exp{\left(\alpha \beta_0\right)} & \text{if $X_i=0$,} \\ \int_0^t{\exp{\left(\gamma + \alpha\left(\beta_0+\beta_1+\beta_2 \tau\right)\right)} \,d\tau} & \text{if $X_i=1$.} \end{cases} \\ &= \begin{cases} t \exp{\left(\alpha \beta_0\right)} & \text{if $X_i=0$,} \\ \exp{\left(\gamma + \alpha\left(\beta_0+\beta_1\right)\right)} \frac{1}{\alpha\beta_2} \left(\exp\left(\alpha\beta_2 t\right)-1\right) & \text{if $X_i=1$.} \end{cases} \end{align} $$

These then give us the survival functions:

$$ \begin{align} S\left(t\right) &= \exp{\left(-\Lambda\left(t\right)\right)} \\ &= \begin{cases} \exp{\left(-t \exp{\left(\alpha \beta_0\right)}\right)} & \text{if $X_i=0$,} \\ \exp{\left(-\exp{\left(\gamma + \alpha\left(\beta_0+\beta_1\right)\right)} \frac{1}{\alpha\beta_2} \left(\exp\left(\alpha\beta_2 t\right)-1\right)\right)} & \text{if $X_i=1$.} \end{cases} \end{align} $$

You then generate by sampling $X_i$ and $U\sim\mathrm{Uniform\left(0,1\right)}$, substituting $U$ for $S\left(t\right)$ and rearranging the appropriate formula (based on $X_i$) to simulate $t$. This should be straightforward algebra you can then code up in R but please let me know by comment if you need any further help.

Thank you very much for the algebra. I will code up in R and will contact you for further help. — Sheikh, Jul 29 '14 at 17:07
what a perfect answer, @tristan. I had a similar question and found your answer. Just awesome. — Sam, Sep 04 '15 at 19:12
@tristan I'm a bit confused about the meaning of alpha in the first equation you give where Xi = 0. Would you mind expanding a bit on that? Thanks. — Statwonk, Aug 06 '16 at 17:07
@Statwonk it follows from the hazard rate equation provided by the original poster — tristan, Aug 07 '16 at 20:20
Sorry, but I am not sure how to use the function S(t) for simulating the times. I think you should compute S^{-1} and this function is not trivial for the case X_i=1. — Pmc, Jan 31 '17 at 18:13
@Pmc yes you either need to compute S^{-1} or solve it as an implicit equation, e.g. using numerical methods. You aren't the OP, are you answering the exact same question? Is it homework? — tristan, Feb 02 '17 at 06:38

How to generate survival data with time dependent covariates using R

1 Answers1