5

I would like to generate a random number from a piecewise exponential distribution. I consider that the time-scale is divided in $J$ intervals with bounds $(s_{j-1},s_j]$, for $j=1,...,J$, and corresponding rates $\lambda_j$.

Considering the memoryless property of the standard exponential distribution, is it correct to generate the random number with the following procedure for $i=1,...,J$?

  1. Generate $x_i$ from a standard exponential distribution with rate $\lambda_i$;
  2. If $x_i<s_i$ the random number is obtained as $y_i<x_i+s_{i-1}$, otherwise continue with the next $i$.

In addition, the PDF of this piecewise exponential distribution is given by: $$ k(t)=\prod_{h=1}^{j-1}(e^{-\lambda_h(s_h-s_{h-1})})(\lambda_j)(e^{-\lambda_j(t-s_{j-1})})I(s_{j-1}<t\leq s_j) $$

  • 1
    What exactly does this distribution describe? Is it a distribution of times between events in a point stochastic process? Is it the intensity of a process as a function of time? Could it be something else? Please edit your question to clarify this key piece of information. – whuber Aug 02 '14 at 18:57
  • 2
    @whuber - Sorry I forgot it! It is a distribution of survival times! – Alessandro Beretta Aug 02 '14 at 19:46
  • 1
    When you say it's piecewise exponential, do you mean for the density to be continuous or discontinuous at the join-points? (That is, is the log-density a linear-spline, or is it just a set of distinct line-segments? It sounds from your comment just above as if it should be continuous) -- In either case the approach is simple, but either way involves a little "accounting" to keep everything straight. – Glen_b Aug 02 '14 at 23:56
  • @Glen_b - First of all, thanks for your answer! For simplicity I'm considering this distribution as discrete. I added in my first post its probability density function. – Alessandro Beretta Aug 03 '14 at 01:13
  • 1
    I'm confused -- the exponential is continuous (the discrete analog is geometric) -- and if it was somehow discrete, you wouldn't call $k$ its pdf. What am I misunderstanding? – Glen_b Aug 03 '14 at 01:27
  • @Glen_b - No, I think you are understanding more than me :) Sorry, you are right... So, I think it is discontinuous at the end of each interval... – Alessandro Beretta Aug 03 '14 at 01:32
  • 1
    A less general setting is discussed in http://stats.stackexchange.com/questions/73373/generating-survival-times-for-a-piecewise-constant-hazard-model-with-two-change – Michael M Aug 03 '14 at 07:14

2 Answers2

3

If you know the parameters, and thus you know the area of each segment, then you can choose a segment at random from the relevant multinomial distribution over $[1,2,\ldots,J]$.

Then you simply need to sample from a (shifted) truncated exponential appropriate to that segment.

There are a variety of approaches for the truncated exponential, depending on your needs (speed, ease of generation, any known facts about the likely width of the interval relative to the mean of the untruncated exponential, and so on).

If speed's important, and the distribution isn't changing from draw to draw, one might, for example, use a ziggurat-type approach. Or one can generate a truncated exponential by taking minus the log of a uniform on an appropriate subset of the unit interval.

Generating a truncated exponential via a rejection approach along the lines you suggest (I didn't check your pseudo-code was right but the basic idea can certainly be made to work) may in some circumstances be extremely inefficient.


Looking more closely at your pseudocode I think it is incorrect in several aspects.

Note that the interval from $s_{i-1}$ to $s_i$ is of length $s_i-s_{i-1}$, so you're truncate your generated exponential at $s_i-s_{i-1}$ before adding $s_{i-1}$ to produce your random value in the interval. Further, I think your expression after "the random number is obtained as" should have "=" rather than "<".

Glen_b
  • 257,508
  • 32
  • 553
  • 939
  • Thanks again for your answer! I appreciate it, but unfortunately I didn't understand it completely (I'm not a statistician). A part the fact that my pseudo-code is inefficient in terms of generation-time, do you think that it is correct from a theoretical point of view? – Alessandro Beretta Aug 03 '14 at 01:28
  • 2
    I'm sorry, I can't respond to this, since your previous comment about discreteness makes it clear I have no clear idea what you're even discussing. – Glen_b Aug 03 '14 at 01:30
  • Forgive me... You are right, even my ideas are not really clear, but I would like to try to understand more... What I'm sure 100% is that the hazard rate is constant and different in each time-interval... So, if I'm right looking at the PDF the distribution should be discontinuous at the end of each interval... – Alessandro Beretta Aug 03 '14 at 01:48
3

Following your suggestions I decided to use the inverse CDF method (more simple).

I know that the hazard rates are constant in the time-intervals $$ \lambda(t) = \begin{cases} \lambda_1, & \mbox{if } 0<t\leq t_1 \\ \lambda_2, & \mbox{if } t_1<t\leq t_2 \\ \vdots & \vdots\\ \lambda_J, & \mbox{if } t_{J-1}<t\leq t_J \end{cases} $$

that the cumulative hazard function is given by

$$ \Lambda(t) = \begin{cases} \lambda_1 t, & \mbox{if } 0<t\leq t_1 \\ \lambda_1 t_1+\lambda_2(t-t_1), & \mbox{if } t_1<t\leq t_2 \\ \vdots & \vdots\\ \sum_{i=1}^{J-1}\lambda_i(t_i-t_{i-1})+\lambda_J(t-t_{J-1}), & \mbox{if } t_{J-1}<t\leq t_J \end{cases} $$

and that the cumulative density function is given by $F(t)=1-e^{-\Lambda(t)}$.

So generating $u\sim U(0,1)$ (simulation of the CDF value) it is possible to obtain a number ($t$) which follows a piecewise exponential distribution. More in details, it is possible to use the inverse of the cumulative hazard function (as follows), calculating $x=-\ln(1-u)$,

$$ t=\Lambda^{-1}(x) = \begin{cases} \frac{x}{\lambda_1}, & \mbox{if } 0<x\leq \lambda_1t_1 \\ t_1+\frac{x-\lambda_1 t_1}{\lambda_2}, & \mbox{if } \lambda_1t_1<x\leq \lambda_1t_1+\lambda_2(t_2-t_1) \\ \vdots & \vdots\\ t_{J-1}+\frac{x-\sum_{i=1}^{J-1}\lambda_i(t_i-t_{i-1})}{\lambda_J}, & \mbox{if } \sum_{i=1}^{J-1}\lambda_i(t_i-t_{i-1})<x\leq \sum_{i=1}^{J}\lambda_i(t_i-t_{i-1}) \end{cases} $$

The only thing is missing is how to treat values of $x>\sum_{i=1}^{J}\lambda_i(t_i-t_{i-1})$.

Do you agree that this procedure to generate random number (piecewise exponentially distributed) is better?

Cam.Davidson.Pilon
  • 11,476
  • 5
  • 47
  • 75
  • For $x > \sum_{i=1}^J \lambda_i (t_i - t_{i-1})$, it's $t_J + \frac{(x - \sum_{i=1}^J \lambda_i (t_i - t_{i-1}))}{ \lambda_{J+1}}$ (that's right, you need one more $\lambda$ than you originally specified). – Cam.Davidson.Pilon Mar 26 '19 at 14:48
  • Here's a gist in Python: https://gist.github.com/CamDavidsonPilon/b0864ce6794fee57c12421d9b383562c – Cam.Davidson.Pilon Mar 26 '19 at 14:50