Suppose I have a relatively large number of samples (~1k) drawn from a series of ~40 increasingly long-tailed distributions, ranging from approximately normal to approximately log-normal. I want to estimate the mean of each distribution and its uncertainty. I use jackknife resampling for this because the samples are highly correlated between adjacent distributions in the series. However, as the distributions become more long-tailed, the estimated uncertainty of the mean grows so large that the data are useless.
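As a minimal sketch of this setup (illustrative only: the function name `jackknife_means`, the toy data family `exp(sigma * z)`, and the choice to delete the same sample index from every distribution so that cross-distribution correlations are kept are all assumptions, not the actual analysis):

```python
import math
import random


def jackknife_means(samples):
    """Delete-one jackknife for the means of K correlated series.

    samples: list of K lists, each of length N. Index i in every series is
    assumed to belong to the same underlying draw (hypothetical layout).
    Returns (means, errors, cov), where cov is the K x K jackknife
    covariance matrix of the means.
    """
    k = len(samples)
    n = len(samples[0])
    totals = [sum(s) for s in samples]
    means = [t / n for t in totals]

    # Leave-one-out means: the same index i is removed from every series,
    # which is what preserves the correlation between series.
    loo = [[(totals[j] - samples[j][i]) / (n - 1) for j in range(k)]
           for i in range(n)]
    loo_avg = [sum(row[j] for row in loo) / n for j in range(k)]

    # Jackknife covariance: (n - 1) / n times the sum of outer products of
    # the deviations of the leave-one-out means.
    factor = (n - 1) / n
    cov = [[factor * sum((row[a] - loo_avg[a]) * (row[b] - loo_avg[b])
                         for row in loo)
            for b in range(k)]
           for a in range(k)]
    errors = [math.sqrt(cov[j][j]) for j in range(k)]
    return means, errors, cov


# Toy data (assumption): one shared normal draw z per sample, mapped to
# exp(sigma * z). Small sigma is close to normal, large sigma is log-normal
# with a long tail, and all series are strongly correlated through z.
rng = random.Random(0)
z = [rng.gauss(0.0, 1.0) for _ in range(1000)]
sigmas = [0.1, 0.5, 1.0, 2.0]
toy = [[math.exp(s * zi) for zi in z] for s in sigmas]

toy_means, toy_errors, toy_cov = jackknife_means(toy)
for s, m, e in zip(sigmas, toy_means, toy_errors):
    print(f"sigma={s}: mean={m:.4g}, relative error={e / m:.3g}")
```

For the plain mean, this jackknife error reduces to the usual standard error of the mean, so in this sketch the growth of the relative error with `sigma` comes from the tail itself; the off-diagonal entries of `cov` carry the correlation between distributions.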
I considered applying robust estimators, but my review of the literature suggests that these methods assume the "outliers" are erroneous data, drawn from a distribution other than the one of interest. In my case, they are genuine samples from the true distribution that appear to be outlying only because the region in which they lie is sparsely sampled.
- Are the usual robust estimators valid in this case? If so, how should they be applied when an outlier in one distribution does not correspond to outliers in the other distributions?
- If not, are there other suitable methods? I experimented with power transformations, but could not work out how they could be used while retaining the correlations, since the transformation parameters would differ between distributions.