2

Can someone explain the benefits of smoothing the survival function (or, equivalently, the empirical c.d.f.)? Smoothing flattens out some peaks that might be important. Would one still want to smooth the estimate if there is enough data? Any simple explanations or references would be highly welcome.

user90772

3 Answers

3

My two cents: there might be two reasons. One is ease of estimation: semi-parametric approaches such as the Cox model or the Kaplan-Meier estimate may be difficult to extend (for example, when adding more complex random effects), in which case a spline-based approach may be used.

The second reason is that doctors tend to prefer it. Technically, the Kaplan-Meier estimate is just a discrete-time estimate of the true survival function, which one would expect to be continuous. For example, it does not really make sense to think that at a certain time point survival suddenly drops by some amount, while one second earlier the individual had a very different survival probability.
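
To make the spline idea concrete, here is a minimal R sketch; the survival and flexsurv packages, the built-in lung data set, and the choice of k = 2 knots for a Royston-Parmar spline model are all illustrative assumptions, not part of the answer above:

    ## Overlay a spline-based smooth fit on the Kaplan-Meier step function.
    ## Packages, data set and number of knots are arbitrary illustrative choices.
    library(survival)
    library(flexsurv)

    km  <- survfit(Surv(time, status) ~ 1, data = lung)                # step-function estimate
    rps <- flexsurvspline(Surv(time, status) ~ 1, data = lung, k = 2)  # Royston-Parmar spline fit

    plot(km, conf.int = FALSE, xlab = "Days", ylab = "Survival probability")
    lines(rps, ci = FALSE)  # adds the smooth fitted survival curve to the plot

Both estimates target the same survival function; the spline version simply constrains it to be smooth.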

Theodor
  • Agree, but maybe you could provide some references on this? – Tim Jul 15 '16 at 20:26
  • It is fairly general knowledge, so I do not have references. There are a number of packages that estimate smooth baseline hazards in R. You can find them in the Survival Analysis R task view online; maybe look at their documentation. – Theodor Jul 15 '16 at 23:11
3

First off, I would say that at least 9 out of 10 times you should prefer the unsmoothed version. And this is coming from someone who has worked on smoothing survival functions.

With that in mind, I will give two cases where smoothing may be a good idea.

1.) Density or hazard estimation. While the Kaplan-Meier curve is a consistent estimator of the survival curve, it is a degenerate estimator of the density, and thus of the hazard function as well. If you just want survival probabilities, you likely do not care about these functions directly, but there are questions that cannot be answered well without directly modeling the density or hazard (is the density multimodal? what is the current risk for someone conditional on surviving up to a given age?). I've also run into statistics that require an estimate of the density for computing standard errors, so again, density estimation would be required (see the first sketch after these two points).

2.) Current status data. In this case, the NPMLE (a generalization of the Kaplan-Meier curve which allows for interval-censored and current status data) is notoriously inefficient; in fact, it has an $n^{1/3}$ convergence rate. It has been shown that assuming log-concavity of the density function (which effectively smooths the survival function by making it once differentiable) increases the convergence rate to $n^{2/5}$ [1]. This can make a nontrivial difference in power (see the second sketch below).
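
To make point 1 concrete, here is a minimal sketch of kernel-smoothing the Nelson-Aalen increments from a Kaplan-Meier fit into a usable hazard estimate; only the survival package is assumed, and the lung data, Gaussian kernel and 50-day bandwidth are arbitrary illustrative choices:

    ## The Kaplan-Meier / Nelson-Aalen increments are point masses, so the implied
    ## hazard is degenerate; a kernel smoother turns them into a usable estimate.
    library(survival)

    fit     <- survfit(Surv(time, status) ~ 1, data = lung)
    dLambda <- fit$n.event / fit$n.risk   # Nelson-Aalen increments at the event times
    tj      <- fit$time

    smooth_hazard <- function(t, bw = 50) {
      # Ramlau-Hansen style estimate: sum_j K_bw(t - t_j) * dLambda_j (Gaussian kernel)
      sapply(t, function(s) sum(dnorm(s - tj, sd = bw) * dLambda))
    }

    grid <- seq(0, max(tj), length.out = 200)
    plot(grid, smooth_hazard(grid), type = "l", xlab = "Days", ylab = "Smoothed hazard")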

[1] Anderson-Bergman, C., and Yu, Y. (2015), "Computing the Log-Concave NPMLE for Interval Censored Data."
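
As an aside on point 2, the current status setting is easy to simulate in base R: the NPMLE of the c.d.f. reduces to an isotonic regression (pool-adjacent-violators) of the event indicators ordered by monitoring time, and the result is an unsmoothed step function. The exponential event times and uniform monitoring times below are arbitrary choices for illustration:

    ## Current status data: we only observe an inspection time and whether the
    ## event has already happened. The NPMLE of F is the isotonic regression of
    ## the indicators ordered by inspection time.
    set.seed(1)
    n          <- 200
    event_time <- rexp(n, rate = 1)                  # latent, never observed directly
    monitor    <- runif(n, 0, 3)                     # inspection times
    delta      <- as.numeric(event_time <= monitor)  # all we get to see

    ord   <- order(monitor)
    npmle <- isoreg(monitor[ord], delta[ord])        # pool-adjacent-violators fit

    plot(npmle$x, npmle$yf, type = "s", xlab = "Time", ylab = "Estimated c.d.f.")
    curve(pexp(x, rate = 1), add = TRUE, lty = 2)    # true c.d.f. for comparison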

But despite that paper, I still think that 9 times out of 10 the Kaplan-Meier curve should be preferred over a smoothed version.

Cliff AB
  • I understand now what the issue is. Given that I am performing a functional data analysis on multiple KM curves, the curves need to be smooth. One way I found to do this is to smooth the hazard rate, fit the estimate on that scale, and then back-transform the predictions to survival probabilities. Is there any specific reason why hazard rates are smoothed with specialized methods instead of just smoothing the estimate $f(t)/S(t)$ with simple kernel/spline methods? Maybe this should be another question on its own – user90772 Jul 22 '16 at 10:32
  • @user90772 Without some kind of smoothing of $\hat S(t)$ or smoothing assumption, we don't have any way of getting $\hat f(t)$ since $f(t) = -S'(t)$. If $\hat S(t)$ is obtained from a Kaplan-Meier curve, then ${\hat S}'(t)$ is undefined given that $\hat S(t)$ is a step function. – Cliff AB Jul 22 '16 at 18:03
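
Following up on this comment exchange, a minimal base-R sketch of the back-transform route, $S(t) = \exp(-\Lambda(t))$; only the survival package is assumed, and the lung data, Gaussian kernel and 50-day bandwidth are arbitrary illustrative choices:

    ## Smooth the Nelson-Aalen cumulative hazard and back-transform to survival.
    library(survival)

    fit     <- survfit(Surv(time, status) ~ 1, data = lung)
    dLambda <- fit$n.event / fit$n.risk   # Nelson-Aalen increments
    tj      <- fit$time
    bw      <- 50                         # arbitrary bandwidth (days)

    # Replace each point mass by a Gaussian kernel c.d.f. to get a smooth Lambda(t)
    Lambda_smooth <- function(t) sapply(t, function(s) sum(pnorm(s - tj, sd = bw) * dLambda))

    grid <- seq(0, max(tj), length.out = 200)
    plot(fit, conf.int = FALSE, xlab = "Days", ylab = "Survival probability")
    lines(grid, exp(-Lambda_smooth(grid)), col = "red")  # smooth, back-transformed curve
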
2

As with any other function estimation problem, there is a tradeoff between bias and variance that smoothing can help to optimize (or at least improve on).

If you don't smooth, you fail to incorporate relevant "nearby" information, and leaving that information out makes your variance larger. On the other hand, incorporating it through smoothing increases bias (for example, by smoothing off narrow peaks, should they occur).

If you think about it in (say) a mean-square error sense (chosen here just for convenience, since the ideas translate to many other measures of fit), recall that $\text{MSE} = \text{bias}^2 + \text{variance}$, so there is a "sweet spot" where the combined effect of bias and variance on the MSE is minimized. If you smooth less, your estimate is noisier, and the extra variance increases the MSE by more than the bias reduction helps it; if you smooth more, the added bias increases the MSE by more than the variance reduction helps it.
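
A small Monte Carlo sketch of that sweet spot; everything here is an arbitrary illustrative choice (exponential data without censoring, a Gaussian-kernel-smoothed empirical survival curve, a handful of bandwidths), and plotting the resulting MSE against the bandwidth typically traces out the U shape described above:

    ## MSE of a kernel-smoothed empirical survival curve as a function of bandwidth.
    ## A bandwidth near zero recovers the unsmoothed empirical survival function.
    set.seed(1)
    n      <- 50
    grid   <- seq(0.1, 3, length.out = 50)
    S_true <- exp(-grid)                        # true survival of Exp(1) data
    bws    <- c(0.01, 0.05, 0.1, 0.2, 0.4, 0.8)

    smooth_surv <- function(x, t, bw) {
      # 1 - average of Gaussian kernel c.d.f.s centred at the observations
      1 - colMeans(outer(x, t, function(xi, ti) pnorm(ti - xi, sd = bw)))
    }

    mse <- sapply(bws, function(bw) {
      mean(replicate(500, {
        x <- rexp(n)
        mean((smooth_surv(x, grid, bw) - S_true)^2)
      }))
    })

    plot(bws, mse, type = "b", log = "x",
         xlab = "Bandwidth (amount of smoothing)", ylab = "Mean squared error")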

Glen_b