Generate tail of distribution by a given sample in R

Question

I have a sample of measurements from a real-life device that misses every measurement below some threshold (the device is not precise enough).

From theory, and from measurements made with more precise devices, I know that the data follow a Gumbel distribution very closely.

I estimated the parameters of a Gumbel distribution for the given sample by MLE, and now I want to generate the "perfect" missing values and add them to the sample.

I am not sure what the easiest way to do that in R is. I have my sample and its size, obviously. I want to create a copy of it and then add some values using the theoretical PDF.

Do you know what the threshold is, or do you also want to estimate that from your data? — P.Windridge, Nov 29 '15 at 12:29

score 4 · Accepted Answer · answered Dec 04 '15 at 10:23

I'm assuming that the values are truncated below the threshold $t$, rather than censored below $t$ (that is, you don't know how many there are below the threshold).

Let the number of points observed above the truncation point be $n_o$.

A simple approach could go as follows:

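1. Estimate $p$, the proportion of the distribution below the threshold, from the fitted parameters and the truncation point. (This ignores the uncertainty in the parameter estimates; if your sample is large, that may not be a terrible approximation.)
2. Simulate $N_u$, the number of missing values, from the negative binomial $NB(n_o,p)$, obtaining the specific value $n_u$. Here $N_u$ counts the below-threshold values that occur before the $n_o$-th observed value, each value falling below $t$ with probability $p$ (in R's `rnbinom` parametrisation this is `size` $= n_o$, `prob` $= 1-p$).
3. Simulate $n_u$ values from the Gumbel truncated to $(-\infty,t)$, since the Gumbel is supported on the whole real line; for example, you might consider some form of accept-reject.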
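The three steps above could be sketched in R roughly as follows. This is only a minimal illustration under stated assumptions: base R has no Gumbel functions, so `pgumbel`, `qgumbel`, `rgumbel` and `draw_below` are hypothetical helpers defined here for the maximum-type Gumbel with CDF $F(x)=\exp(-e^{-(x-\mu)/\beta})$; `mu_hat`, `beta_hat` and `t` are made-up stand-ins for your fitted parameters and threshold; and `x_obs` is a simulated stand-in for your observed sample.

```r
# Hypothetical helpers: Gumbel (maximum) CDF, quantile function and sampler.
pgumbel <- function(x, mu, beta) exp(-exp(-(x - mu) / beta))
qgumbel <- function(u, mu, beta) mu - beta * log(-log(u))
rgumbel <- function(n, mu, beta) qgumbel(runif(n), mu, beta)

# Assumed inputs: replace with your own MLE estimates and threshold.
mu_hat   <- 10
beta_hat <- 2
t        <- 8

# Stand-in for the observed (truncated) sample: keep only values above t.
set.seed(1)
full  <- rgumbel(5000, mu_hat, beta_hat)
x_obs <- full[full > t]
n_o   <- length(x_obs)

# Step 1: estimated proportion of the distribution below the threshold.
p <- pgumbel(t, mu_hat, beta_hat)

# Step 2: number of missing values. rnbinom() returns the number of
# "failures" (values below t) before `size` "successes" (observed values),
# where `prob` is the success probability, here 1 - p.
n_u <- rnbinom(1, size = n_o, prob = 1 - p)

# Step 3: accept-reject. Draw from the full Gumbel, keep only values below t,
# and repeat until n values have been accepted.
draw_below <- function(n, t, mu, beta, p_below) {
  out <- numeric(0)
  while (length(out) < n) {
    batch <- ceiling(1.2 * (n - length(out)) / p_below) + 10
    cand  <- rgumbel(batch, mu, beta)
    out   <- c(out, cand[cand < t])
  }
  out[seq_len(n)]
}
x_missing <- draw_below(n_u, t, mu_hat, beta_hat, p)

# Completed sample: a copy of the observed values plus the simulated tail.
x_full <- c(x_obs, x_missing)
```

If $p$ is very small, accept-reject wastes most draws; an alternative (not what the steps above describe) is inverse-CDF sampling, `qgumbel(runif(n_u, 0, p), mu_hat, beta_hat)`, which returns values below $t$ directly.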
