Question: Is anyone aware of a consistent, non-parametric estimator of the expected value of an asymmetric distribution that is robust to fat tails? What if we constrain ourselves to the class of continuous, unimodal distributions on the real line?

Note: There are several parametric estimators in the literature; I am specifically interested in non-parametric estimators.

Colin T Bowers
  • What's the problem with the sample mean? – user603 Jan 22 '14 at 11:09
  • @user603 It's not robust :-) The data I'm working with is (very) fat-tailed, so the sample mean is very noisy, even with lots of observations. The sample median or sample trimmed mean are examples of robust estimators, but given an asymmetric distribution, they do not (necessarily) converge to the expected value. – Colin T Bowers Jan 22 '14 at 13:08
  • If we are talking about fat-tailed distributions, then the existence of an expected value is a very strong assumption. – Michael M Jan 22 '14 at 16:02
  • @MichaelMayer Very true and to be honest I'm not completely convinced it does exist for the type of data I'm working with. Nonetheless, for the current paper I'm working on I am assuming it exists so as to fit in with the extant literature :-) I might look at other location parameters in a future paper... – Colin T Bowers Jan 23 '14 at 01:09

1 Answer

In the univariate setting, I would proceed as follows:

  1. Compute the adjusted whiskers from the adjusted boxplot (Hubert and Vandervieren).
  2. Compute a weighted mean by assigning weight 1 to the observations inside the adjusted whiskers and weight 0 to those outside. This is a form of trimmed mean, but the trimming takes the asymmetry of the good part of the data into account.
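
The two steps above can be sketched in plain Python. This is only a minimal illustration, not a reference implementation: it assumes the usual adjusted-boxplot fences built from the medcouple (MC), namely [Q1 - 1.5·exp(-4·MC)·IQR, Q3 + 1.5·exp(3·MC)·IQR] when MC ≥ 0, with the exponents -3 and 4 when MC < 0. The function names are hypothetical, the medcouple is the naive O(n²) version, and ties at the median are simply given kernel value 0, which is a simplification that should be harmless for continuous data.

```python
import math
import statistics


def medcouple(xs):
    """Naive O(n^2) medcouple: a robust skewness measure in [-1, 1]."""
    med = statistics.median(xs)
    lower = [x for x in xs if x <= med]
    upper = [x for x in xs if x >= med]
    kernel = []
    for a in lower:
        for b in upper:
            if b > a:
                kernel.append(((b - med) - (med - a)) / (b - a))
            else:
                # a == b == med: simplified tie handling (assumption).
                kernel.append(0.0)
    return statistics.median(kernel)


def adjusted_fences(xs):
    """Skewness-adjusted fences of the adjusted boxplot."""
    q1, _, q3 = statistics.quantiles(xs, n=4, method="inclusive")
    iqr = q3 - q1
    mc = medcouple(xs)
    if mc >= 0:
        lo = q1 - 1.5 * math.exp(-4 * mc) * iqr
        hi = q3 + 1.5 * math.exp(3 * mc) * iqr
    else:
        lo = q1 - 1.5 * math.exp(-3 * mc) * iqr
        hi = q3 + 1.5 * math.exp(4 * mc) * iqr
    return lo, hi


def adjusted_trimmed_mean(xs):
    """Mean with weight 1 inside the adjusted fences and weight 0 outside."""
    lo, hi = adjusted_fences(xs)
    kept = [x for x in xs if lo <= x <= hi]
    return statistics.fmean(kept)
```

For real data sets a faster medcouple routine would be preferable, since the nested loop here grows quadratically with the sample size.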
user603
  • +1 I hadn't come across Hubert and Vandervieren's paper in my literature search. It was an interesting read, thanks for the reference. However, the estimator you propose is not (necessarily) consistent, so I'll hold off on awarding the answer tick for now. It is plausible that the question has no answer, so if there are no other responses over the next few weeks I'll come back and give the tick. Thanks again. – Colin T Bowers Jan 23 '14 at 03:56
  • I'm interested in your comment. It's not clear to me how an estimate of centrality can be inconsistent. – user603 Jan 23 '14 at 06:54
  • The parameter I am specifically interested in is the expected value (i.e. the first moment), and by "consistency" I mean convergence in probability to the first moment. My reading of the Hubert/Vandervieren paper indicates that their method is ad hoc, i.e. it is a good approximation across a wide variety of distributions, but nonetheless the resulting asymmetric trimmed mean is not guaranteed to converge in probability to the first moment. Thus by my definition it is inconsistent. Unfortunately, for the paper I am writing, the parameter of interest *must* be the first moment (for several reasons). – Colin T Bowers Jan 23 '14 at 10:16
  • I understand the use of a consistency factor to correct the systematic bias that trimming introduces in the second moment. I've never encountered the concept of a consistency correction factor for the first moment. Under the conditions of your comment, I wonder if *any* estimator would work at all. On the one hand, you don't know the distribution of the good data, so, if you want to exclude the outliers, you will have to do some form of trimming. But by your comment above, this will cause the trimmed mean to be inconsistent. – user603 Jan 23 '14 at 12:29
  • "I wonder if *any* estimator would work at all" - I agree with this sentiment. This is why I said in an earlier comment that it is plausible the question has no answer. After all, to obtain consistency for the first moment, we need to say something about the tails of the distribution, but without making a parametric assumption, it is difficult to see how to do this without using the observations from the tails. I was hoping that I was overlooking something clever, but perhaps what I am trying to do is simply impossible without a parametric assumption. – Colin T Bowers Jan 24 '14 at 00:19
  • @ColinTBowers: thanks for having taken the time to respond to my comment. Indeed, it may be that we are overlooking some more clever approach. – user603 Jan 24 '14 at 11:20
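
The consistency concern discussed in these comments can be made concrete with a small worked example. The sketch below assumes an Exponential(1) population, chosen only because the truncated mean has a closed form (it is asymmetric but not fat-tailed), and a hypothetical fixed upper cutoff c. A mean that discards everything above c targets E[X | X ≤ c] = 1 - c·exp(-c) / (1 - exp(-c)), which is strictly below the first moment of 1 for any finite c, so more observations do not remove the gap unless the cutoff also grows.

```python
import math


def truncated_exponential_mean(c):
    """Mean of an Exponential(1) variable conditional on X <= c.

    E[X | X <= c] = 1 - c * exp(-c) / (1 - exp(-c)), which is below the
    untruncated mean of 1 for every finite c > 0.
    """
    return 1.0 - c * math.exp(-c) / (1.0 - math.exp(-c))


# The gap to the true mean shrinks only as the cutoff moves into the tail.
for c in (2.0, 5.0, 10.0):
    print(c, truncated_exponential_mean(c))
```

With heavier tails the discarded mass would generally carry more of the mean, so the same effect is expected to be more pronounced, though that is not shown here.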