In survival analysis, one often assumes that the survival time $X_i$ is exponentially distributed. Suppose I have $x_1,\dots,x_n$ "outcomes" of i.i.d. r.v.'s $X_i$. Only some proportion of these outcomes are in fact "fully realized", i.e. the remaining observations are still "alive".

If I wanted to compute a maximum likelihood (ML) estimate of the rate parameter $\lambda$ of the distribution, how could I use the non-realized observations in a coherent/appropriate manner? I believe they still contain useful information for the estimation.

Could someone guide me to literature on this topic? I am sure it exists. I am however having trouble finding good keywords/search terms for the topic.

kjetil b halvorsen
Good Guy Mike
  • So you are saying that from the $n$ random variables of which you have a measurement, say $n_1 < n$ observations represent "finalized" life-lengths (because the associated random variables were "dead" at measurement time), while the rest $n_2 – Alecos Papadopoulos Jan 14 '15 at 09:48
  • Yes, that's precisely what I was trying to say! – Good Guy Mike Jan 14 '15 at 09:50
  • This is a truncated model, the "alive" random variables being truncated at the time the observation stops. – Xi'an Jan 14 '15 at 10:24
  • Check out [Tobit models](http://en.wikipedia.org/wiki/Tobit_model) for truncated data and related sources (e.g. [here](http://www.bauer.uh.edu/rsusmel/phd/ec1-24.pdf)). – Richard Hardy Jan 14 '15 at 10:29
  • You seem to have censored data, like lifetimes, where some people died but some are still alive, such that you only know that, say, $x_i > t_i$ for some known constant $t_i$. – kjetil b halvorsen Jan 14 '15 at 10:56
  • Beware of the sometimes subtle difference between the two situations. It is not uncommon for truncation to be confused with censoring, and vice versa. – Alecos Papadopoulos Jan 14 '15 at 11:07
  • Yes, I first thought the terms were interchangeable, but I later realized that censored data was what I was actually looking for! – Good Guy Mike Jan 14 '15 at 14:13
  • Here is a very good book: Klein J.P., Moeschberger M.L. "Survival Analysis: Techniques for Censored and Truncated Data", 2nd ed., 2003, Springer. – StijnDeVuyst Oct 19 '17 at 15:01

1 Answer


You can still estimate parameters by using the likelihood directly. Let the observations $x_1, \dots, x_n$ come from the exponential distribution with unknown rate $\lambda>0$. The density function is $f(x;\lambda)= \lambda e^{-\lambda x}$, the cumulative distribution function is $F(x;\lambda)=1-e^{-\lambda x}$, and the tail function is $G(x;\lambda)=1-F(x;\lambda) = e^{-\lambda x}$. Assume the first $r$ observations are fully observed, while for $x_{r+1}, \dots, x_n$ we only know that $x_j > t_j$ for some known positive constants $t_j$. As always, the likelihood is the "probability of the observed data". For the censored observations, that is given by $P(X_j > t_j) = G(t_j;\lambda)$, so the full likelihood function is $$ L(\lambda) = \prod_{i=1}^r f(x_i;\lambda) \cdot \prod_{j=r+1}^n G(t_j;\lambda) $$ The loglikelihood function then becomes $$ l(\lambda) = r\log\lambda -\lambda(x_1+\dots+x_r+t_{r+1}+\dots+ t_n) $$ which has the same form as the loglikelihood for the usual, fully observed case, except for the first term, $r\log\lambda$ in place of $n\log\lambda$. Write $T = \frac{1}{n}(x_1+\dots+x_r+t_{r+1}+\dots+t_n)$ for the mean of the observations and censoring times. Setting $l'(\lambda) = \frac{r}{\lambda} - nT = 0$, the maximum likelihood estimator of $\lambda$ becomes $\hat{\lambda}=\frac{r}{nT}$, which you yourself can compare with the fully observed case.
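As a minimal sketch of the estimator $\hat{\lambda} = r/(nT)$, the following Python snippet computes it from a list of fully observed lifetimes and a list of censoring times. The function names and the sample numbers are made up for illustration and are not part of the original answer.

```python
import math


def censored_exponential_mle(observed, censored):
    """MLE of the exponential rate: r / (sum of lifetimes + sum of censoring times)."""
    r = len(observed)
    if r == 0:
        # With no fully observed lifetime the loglikelihood has no maximizer.
        raise ValueError("no uncensored observations: the MLE does not exist")
    total_time = sum(observed) + sum(censored)  # this is n * T in the text
    return r / total_time


def log_likelihood(lam, observed, censored):
    """l(lambda) = r * log(lambda) - lambda * (total time at risk)."""
    r = len(observed)
    return r * math.log(lam) - lam * (sum(observed) + sum(censored))


# Hypothetical data: three observed lifetimes, two units still alive at t = 3.0.
observed = [0.5, 1.2, 2.0]
censored = [3.0, 3.0]

lam_hat = censored_exponential_mle(observed, censored)
print("estimated rate:", lam_hat)
print("loglikelihood at the estimate:", log_likelihood(lam_hat, observed, censored))
```

Dropping the censored units instead would give $3/3.7$ here, a noticeably larger rate, which is why the censoring times belong in the denominator.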

 EDIT   

To try to answer the question in the comments: if all observations were censored, that is, we did not wait long enough to observe any event (death), what can we do? In that case, $r=0$, so the loglikelihood becomes $$ l(\lambda) = -nT \lambda $$ that is, it is linear and decreasing in $\lambda$. So the maximum must be at $\lambda=0$! But zero is not a valid value for the rate parameter $\lambda$, since it does not correspond to any exponential distribution. We must conclude that in this case the maximum likelihood estimator does not exist! Maybe one could try to construct some sort of confidence interval for $\lambda$ based on that loglikelihood function? For that, look below.

But, in any case, the real conclusion from the data in that case is that we should wait more time until we get some events ...
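A quick numerical illustration of the all-censored case, as a sketch with made-up censoring times: the loglikelihood $-nT\lambda$ only increases as $\lambda$ shrinks toward zero, so no positive rate maximizes it.

```python
def all_censored_log_likelihood(lam, censor_times):
    """Loglikelihood when every observation is censored: -lambda * sum(t_j)."""
    return -lam * sum(censor_times)


# Hypothetical censoring times; sum(censor_times) plays the role of n * T.
censor_times = [2.0, 2.0, 3.5]
rates = [0.01, 0.1, 1.0]

values = [all_censored_log_likelihood(lam, censor_times) for lam in rates]
for lam, value in zip(rates, values):
    print(f"lambda = {lam}: loglikelihood = {value}")
```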

Here is how we can construct a (one-sided) confidence interval for $\lambda$ in case all observations get censored. The likelihood function in that case is $e^{-\lambda n T}$, which has the same form as the likelihood function from a binomial experiment where we got all successes, which is $p^n$ (see also Confidence interval around binomial estimate of 0 or 1). In that case we want a one-sided confidence interval for $p$ of the form $[\underline{p}, 1]$. Then we get an interval for $\lambda$ by solving $\log p = -\lambda T$.

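We get a 95% confidence interval for $p$ by keeping every value of $p$ under which the observed outcome (all $n$ successes) has probability at least $0.05$, that is, by solving $$ P(X=n) = p^n \ge 0.05 ~~~~\text{(say)} $$ so that $ n\log p \ge \log 0.05 $. Substituting $\log p = -\lambda T$ finally gives the confidence interval for $\lambda$: $$ \lambda \le \frac{-\log 0.05}{n T} \approx \frac{3}{nT}. $$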
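The bound has the closed form $-\log(c)/(nT)$, where $c$ is the threshold imposed on $p^n$. The sketch below leaves $c$ as a parameter so that any threshold can be plugged in. The function name and the censoring times are hypothetical.

```python
import math


def rate_upper_bound(censor_times, threshold):
    """Largest lambda with exp(-lambda * sum(t_j)) >= threshold (all data censored)."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    total_time = sum(censor_times)  # this is n * T in the text
    return -math.log(threshold) / total_time


# Hypothetical censoring times for three units, none of which has failed yet.
censor_times = [2.0, 2.0, 3.5]

# Evaluate the bound for a few thresholds c on p^n.
bounds = {c: rate_upper_bound(censor_times, c) for c in (0.05, 0.5, 0.95)}
for c, bound in bounds.items():
    print(f"threshold {c}: lambda <= {bound}")
```

A smaller threshold keeps more values of $p$ in the interval and so gives a larger, more conservative upper bound on $\lambda$.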

kjetil b halvorsen
  • Reading the question and answer I thought "What if _all_ observations are of the second type, for which we only know that $x_j > t_j$, and no observation was fully observed?" It would be really useful to include this case in your answer as well, as an extension. – Alecos Papadopoulos Jan 14 '15 at 12:30
  • Is it not a problem to multiply a probability density by a probability? I think of these things as having two different units. Could you explain how this came to be? – matmat Nov 20 '21 at 07:45
  • @matmat: No, it is not. This is explained at https://stats.stackexchange.com/questions/354671/fitting-distributions-on-censored-data/354808#354808 and probably other places ... – kjetil b halvorsen Mar 01 '22 at 16:57