What is the minimum-variance unbiased estimator of a quantile when the errors are normally distributed?

  • median

    When we wish to estimate the median, $\mu$, of a normally distributed variable, the sample mean (an efficient estimator of $\mu$) performs better than the sample median. The sample mean has a lower variance than the sample median (and is in fact the minimum-variance unbiased estimator of the median $\mu$).

  • But what about other quantiles?

    My intuition says that we can view this as estimating the value of $\mu+k\sigma$ for some given value of $k$, and then use the unbiased estimator $\hat \mu + k \hat \sigma$ based on the two (sufficient) statistics below. Is this also the minimum-variance unbiased estimator?

    $$\hat \mu = \bar{x} \quad \text{ and } \quad \hat \sigma = c_n s = c_n \sqrt{\frac{1}{n-1}\sum_{i=1}^n (x_i-\bar{x})^2}$$

    where $c_n$ is a correction factor to make $\hat \sigma$ unbiased.

    Or is there possibly some other statistic, e.g. a combination of two sample quantiles or some version of minimizing the sum of absolute residuals, that could perform better (better as in, unbiased and lower variance)?
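The comparison asked about above can be explored numerically. The following is a minimal sketch, not an answer to the optimality question: it assumes the standard closed form $c_n = \sqrt{(n-1)/2}\,\Gamma((n-1)/2)/\Gamma(n/2)$ for the correction factor under normality, and it uses one particular sample-quantile definition (linear interpolation between order statistics) as the competitor. The function names `c_n`, `quantile_estimate`, `sample_quantile` and `simulate` are hypothetical, chosen only for this illustration.

```python
import math
import random
import statistics
from statistics import NormalDist


def c_n(n):
    """Factor such that c_n * s is unbiased for sigma (normal data, n >= 2)."""
    log_ratio = math.lgamma((n - 1) / 2.0) - math.lgamma(n / 2.0)
    return math.sqrt((n - 1) / 2.0) * math.exp(log_ratio)


def quantile_estimate(x, p):
    """Estimate the p-quantile as x_bar + k * c_n * s, with k = Phi^{-1}(p)."""
    k = NormalDist().inv_cdf(p)
    s = statistics.stdev(x)  # uses the n - 1 denominator
    return statistics.fmean(x) + k * c_n(len(x)) * s


def sample_quantile(x, p):
    """Empirical p-quantile by linear interpolation of order statistics."""
    xs = sorted(x)
    pos = p * (len(xs) - 1)
    lo = int(math.floor(pos))
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (pos - lo) * (xs[hi] - xs[lo])


def simulate(p=0.95, n=20, reps=4000, seed=0):
    """Monte Carlo mean and variance of both estimators for N(0, 1) data."""
    rng = random.Random(seed)
    parametric, empirical = [], []
    for _ in range(reps):
        x = [rng.gauss(0.0, 1.0) for _ in range(n)]
        parametric.append(quantile_estimate(x, p))
        empirical.append(sample_quantile(x, p))
    return {
        "parametric_mean": statistics.fmean(parametric),
        "parametric_var": statistics.variance(parametric),
        "empirical_mean": statistics.fmean(empirical),
        "empirical_var": statistics.variance(empirical),
    }


if __name__ == "__main__":
    # True 95% quantile of N(0, 1), for reference.
    print("target:", NormalDist().inv_cdf(0.95))
    print(simulate())
```

A simulation like this can only compare the two candidates for particular values of $n$ and $k$; it cannot show that no other unbiased estimator has lower variance.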

Sextus Empiricus
  • I vaguely remember it being a duplicate. I believe that a question about the efficiency of the estimator $\bar{x} + k s$ for estimating $\mu + k \sigma$ has been asked before. But I could not find it easily, so I thought that this question, even if it turns out to be a duplicate after all, may at least be a good pointer for people searching for the same question. – Sextus Empiricus Mar 12 '19 at 17:19
  • Could you please state what you think $k$ should be more specifically? After all, the unbiasedness criterion immediately implies $E[m+ks]=\mu+k\sigma\sqrt{2/(n-1)}\Gamma(n/2)/\Gamma((n-1)/2),$ uniquely determining the value of $k.$ – whuber Mar 12 '19 at 18:03
  • $k$ is a fixed parameter. It is set according to the quantile of the normal distribution that one wishes to estimate. E.g. if one wishes to estimate the 95th percentile of a population that is normally distributed, then one is indirectly estimating the value of $\mu + 1.645 \sigma$. I wonder whether $\bar{x} + 1.645 s$ is an efficient unbiased estimator of $\mu + 1.645 \sigma$ (and in general the same question for other values of $k$). – Sextus Empiricus Mar 12 '19 at 18:45
  • Would I be correct, then, in interpreting your question as asking whether it is the case that $$\sqrt{2/(n-1)}\Gamma(n/2)/\Gamma((n-1)/2)=1$$ for any (or even one) integral value of $n$? – whuber Mar 12 '19 at 19:24
  • To be honest, I do not really understand where the expression $\sqrt{2/(n-1)}\Gamma(n/2)/\Gamma((n-1)/2)$ comes from. What I am asking is whether quantiles of a population can be estimated efficiently using the mean and variance of a sample taken from that population, when the population has a normal distribution. I know that this is the case when the quantile to be estimated is the median (which is a simple case because it is basically estimating the distribution parameter $\mu$). I wonder whether this is also true when the quantile to be estimated is not the median. – Sextus Empiricus Mar 12 '19 at 19:49
  • I took expectations! (See the first comment.) Because you're requiring the estimator to be unbiased, this is the very first thing to check. The point is that the requirement to be unbiased uniquely determines $k.$ That doesn't seem to leave much to ask about. – whuber Mar 12 '19 at 19:52
  • Aha, I see. I was under the impression that $E[m + ks] = E[m] + k E[s]$; I will look into that. But in any case, even if I made that assumption erroneously, it is not the point of my question. – Sextus Empiricus Mar 12 '19 at 19:56
  • Your impression is correct: expectation is linear. What do you suppose the expectation of the sample standard deviation might be? (I haven't checked my formula, but certainly the expectation is not equal to $\sigma.$) – whuber Mar 12 '19 at 19:59
  • I see now that I had mistaken $\frac{1}{n}$ as the correction for $\sigma^2$, *as well as* $\sigma$. – Sextus Empiricus Mar 12 '19 at 20:05
  • Re the edit: the question is now in a form I expected to see when it first appeared, so the time is ripe to post the comment I originally had in mind: exactly in what sense do you mean "better"? Obviously alternative procedures will be biased, so what loss function (or other quantitative objective) would you have in mind? – whuber Mar 12 '19 at 20:13
  • @whuber, I was thinking of the *minimum variance* unbiased estimator. – Sextus Empiricus Mar 12 '19 at 20:49
  • The duplicate I was thinking of was https://stats.stackexchange.com/questions/382124/x-1-x-2-dots-x-n-overset-textiid-sim-n-mu-sigma2-derive-a-confide but it is not the same. – Sextus Empiricus Mar 12 '19 at 21:15
  • The near-but-non-duplicate is a good reference, though, because it concerns a closely related problem: estimation of a quantile is a form of tolerance limit. (That's why I asked the question about what you mean by "better," because the different kinds of tolerance limit differ about that.) – whuber Mar 12 '19 at 21:19
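
The factor discussed in the comments, $E[s]/\sigma = \sqrt{2/(n-1)}\,\Gamma(n/2)/\Gamma((n-1)/2)$, can be evaluated directly to see how far it is from 1 for a given sample size. This is a small sketch using only the Python standard library; the function name `expected_s_over_sigma` is hypothetical, and the formula is the one quoted in the comments, taken here as given for normally distributed data.

```python
import math


def expected_s_over_sigma(n):
    """E[s] / sigma for a normal sample of size n >= 2 (s uses n - 1)."""
    log_ratio = math.lgamma(n / 2.0) - math.lgamma((n - 1) / 2.0)
    return math.sqrt(2.0 / (n - 1)) * math.exp(log_ratio)


if __name__ == "__main__":
    # Print the factor for a few sample sizes; its reciprocal is the
    # correction c_n that makes c_n * s unbiased for sigma.
    for n in (2, 5, 10, 30, 100):
        factor = expected_s_over_sigma(n)
        print(n, factor, 1.0 / factor)
```

Using `math.lgamma` rather than `math.gamma` avoids overflow of the individual gamma values for large $n$.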
