
I have simulated failure time data for some components.

I tested whether a Weibull or a lognormal distribution fits the data better (see the figures below). The figures show the median and $95\%$ prediction intervals in black; the green lines are the non-parametric fit (observations).

It looks like the Weibull distribution fits poorly at the lower tail compared to the lognormal distribution. That is, the Weibull fit overestimates the probability of failing at times lower than (approximately) 45.
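The visual comparison above can also be checked numerically. The following is a minimal sketch, assuming two-parameter fits with the location fixed at zero and using hypothetical stand-in data (the array `t` is not my actual simulation); it compares the empirical fraction of failures before time 45 with the probability each fitted model assigns to that event.

```python
import numpy as np
from scipy import stats

# Hypothetical stand-in for the simulated failure times (assumption).
rng = np.random.default_rng(0)
t = rng.lognormal(mean=4.2, sigma=0.25, size=200)

# Maximum-likelihood fits with the location parameter fixed at 0.
c, _, scale_w = stats.weibull_min.fit(t, floc=0)
s, _, scale_l = stats.lognorm.fit(t, floc=0)

t0 = 45.0  # threshold mentioned in the text
p_emp = np.mean(t <= t0)                                     # empirical CDF at t0
p_weib = stats.weibull_min.cdf(t0, c, loc=0, scale=scale_w)  # fitted Weibull CDF
p_logn = stats.lognorm.cdf(t0, s, loc=0, scale=scale_l)      # fitted lognormal CDF

print(f"P(T <= {t0}): empirical={p_emp:.4f}, "
      f"Weibull={p_weib:.4f}, lognormal={p_logn:.4f}")
```

A fitted probability well above the empirical fraction at small `t0` is the numerical counterpart of the overestimation visible in the plot.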

I think, from a visual check alone, that the lognormal fit is best.

However, I am interested in the lower tail of the failure time distribution, and I believe the lognormal distribution is too optimistic. For example, a $95\%$ prediction interval for the $0.001\%$ quantile under the lognormal model is $[22.5,30.2]$, compared to $[6.4,15.0]$ under the Weibull model.
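To make the comparison of extreme lower quantiles reproducible, here is a rough sketch. It is not the procedure that produced the intervals quoted above: it swaps in a non-parametric bootstrap percentile interval around the maximum-likelihood estimate, and both the helper name `low_quantile_interval` and the data are hypothetical.

```python
import numpy as np
from scipy import stats

def low_quantile_interval(t, dist, q=1e-5, n_boot=200, seed=0):
    """Bootstrap percentile interval for the q-quantile of a fitted model.

    `dist` is a scipy.stats distribution with one shape parameter
    (e.g. weibull_min or lognorm); the location is fixed at 0.
    """
    rng = np.random.default_rng(seed)
    est = np.empty(n_boot)
    for b in range(n_boot):
        # Resample the data with replacement and refit the model.
        resample = rng.choice(t, size=t.size, replace=True)
        shape, _, scale = dist.fit(resample, floc=0)
        est[b] = dist.ppf(q, shape, loc=0, scale=scale)
    return np.percentile(est, [2.5, 97.5])

# Hypothetical stand-in for the simulated failure times (assumption).
rng = np.random.default_rng(1)
t = rng.lognormal(mean=4.2, sigma=0.25, size=200)

# q = 1e-5 corresponds to the 0.001% quantile.
ci_weib = low_quantile_interval(t, stats.weibull_min)
ci_logn = low_quantile_interval(t, stats.lognorm)
print("Weibull   0.001% quantile, 95% interval:", ci_weib)
print("Lognormal 0.001% quantile, 95% interval:", ci_logn)
```

Note that both intervals only reflect parameter uncertainty within the assumed family; neither accounts for the possibility that the family itself is wrong so far below the observed data.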

My understanding is that both the Weibull and the lognormal distributions are heavy-tailed, but the tails of the Weibull fit are "much heavier" in the simulation I have performed.
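For the lower tail specifically, the difference can be seen from the quantile functions. As a sketch, assuming two-parameter forms with Weibull shape $k$ and scale $\lambda$, and lognormal parameters $\mu$ and $\sigma$ (symbols introduced here for illustration), the $p$-quantiles are

$$
t_p^{\text{Weibull}} = \lambda \left[-\ln(1-p)\right]^{1/k} \approx \lambda\, p^{1/k} \quad (p \to 0),
\qquad
t_p^{\text{lognormal}} = \exp\left(\mu + \sigma\, \Phi^{-1}(p)\right).
$$

Equivalently, near zero the Weibull CDF behaves like a power law, $F(t) \approx (t/\lambda)^k$, whereas the lognormal CDF $\Phi\left((\ln t - \mu)/\sigma\right)$ goes to zero faster than any power of $t$. The Weibull therefore keeps more probability mass close to zero, which is consistent with its much smaller estimate of the $0.001\%$ quantile ($p = 10^{-5}$, for which $\Phi^{-1}(p) \approx -4.26$).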

I have also read (no references at hand, sorry) that the lognormal distribution can provide estimates that are too optimistic (in this case predicting that the $0.001\%$ quantile is much higher than it actually is).

The biggest error in my decision problem would be a component failing when I claim that it will not fail, for example, a component failing after I have stated that the components are unlikely to fail during a 15-hour mission. Therefore, I would like a conservative estimate for the $0.001\%$ quantile.

These reasons could suggest the Weibull model would be best for my problem. However, I think these estimates are too conservative (and wrong) because the model fits poorly in the region I am most interested in.

Is there a way I can "add more uncertainty" in the lognormal fit? Any other suggestions about my problem?

[Figure: Weibull fit] [Figure: Lognormal fit]

JLee
  • You don't have data down where you claim the lognormal isn't accurate (it's considerably more accurate where you do have data, especially at the low end - but still looks like it would be somewhat conservative). If you have a source of information about it being wrong in the far lower tail - about where these data are silent - you should be incorporating that information as formally as possible rather than in some highly ad hoc manner whose consequences are unclear. – Glen_b Mar 01 '22 at 02:05
  • Thank you for the reply. Which formal ways have you seen to incorporate the information? I could construct an informative prior putting more weight at the lower tail and see how that affects my estimates. Are there any other ways that you have seen? – JLee Mar 01 '22 at 18:25
  • From the data you have, putting more weight on the lower tail will only push you harder toward the lognormal; your displayed data are much closer to it at the low end. If you have *explicit* information about the lower tail that is outside the data (perhaps expert knowledge, say, or some summary of previous data) you should be able to formalize that, but then neither of these distributions seem like a suitable choice in that case. You might consider broadening your model class (perhaps considering continuous mixtures, for example) and then such a prior might do what you want ... – Glen_b Mar 01 '22 at 22:38
  • ... while still 'respecting' the data you have – Glen_b Mar 01 '22 at 22:40
