17

For the normal distribution, there is an unbiased estimator of the standard deviation given by:

$$\hat{\sigma}_\text{unbiased} = \frac{\Gamma(\frac{n-1}{2})}{\Gamma(\frac{n}{2})} \sqrt{\frac{1}{2}\sum_{i=1}^n(x_i-\bar{x})^2}$$

The reason this result is not so well known seems to be that it is largely a curio rather than a matter of any great import. The proof is covered on this thread; it takes advantage of a key property of the normal distribution:

$$ \frac{1}{\sigma^2} \sum_{i=1}^n(x_i-\bar{x})^2 \sim \chi^{2}_{n-1} $$

From there, with a bit of work, it is possible to take the expectation $\mathbb{E}\left( \sqrt{\sum_{i=1}^n(x_i-\bar{x})^2} \right)$, and by identifying this answer as a multiple of $\sigma$, we can deduce the result for $\hat{\sigma}_\text{unbiased}$.
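As a quick sanity check (not part of the derivation itself), the claimed unbiasedness is easy to verify by simulation. The helper name `sigma_hat_unbiased` below is my own; the formula is the one stated above.

```python
import math
import random

def sigma_hat_unbiased(xs):
    """Unbiased estimator of sigma for a normal sample, using the
    Gamma-function correction from the formula above."""
    n = len(xs)
    xbar = sum(xs) / n
    ss = sum((x - xbar) ** 2 for x in xs)
    return math.gamma((n - 1) / 2) / math.gamma(n / 2) * math.sqrt(ss / 2)

random.seed(1)
n, sigma, reps = 5, 2.0, 200_000

# Average the estimator over many samples; the mean should sit close
# to the true sigma even for a sample size as small as n = 5.
est = [sigma_hat_unbiased([random.gauss(0.0, sigma) for _ in range(n)])
       for _ in range(reps)]
mean_est = sum(est) / reps
print(mean_est)  # should land close to sigma = 2.0
```

By contrast, averaging the usual sample standard deviation $s$ over the same draws would come out noticeably below 2.0 at $n=5$, which is precisely the bias the Gamma factor removes.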

This leaves me curious which other distributions have a closed-form unbiased estimator of the standard deviation. Unlike with the unbiased estimator of the variance, this is clearly distribution-specific. Moreover, it would not be straightforward to adapt the proof to find estimators for other distributions.

The skew-normal distributions have some nice distributional properties for their quadratic forms, of which the normal-distribution property used above is effectively a special case (since the normal is a special type of skew-normal), so perhaps it would not be so hard to extend this method to them. But for other distributions an entirely different approach would appear to be required.

Are there any other distributions for which such estimators are known?

Silverfish
  • 20,678
  • 23
  • 92
  • 180
  • 1
    If you ignore technical distractions, the nature of the answer becomes clearer. In the Normal case little of what you write is really relevant to the conclusion; all that matters is that the amount of bias in this particular estimator is a function of $n$ alone (and does not depend on other distributional parameters that need to be estimated from the data). – whuber Jan 05 '15 at 23:49
  • @whuber I think I can see the general idea you're hinting at, and clearly "function of $n$ alone" is necessary. But I don't think it would be sufficient - if we didn't have access to some nice distributional results, then I can't see how the "closed form" aspect would be tractable. – Silverfish Jan 05 '15 at 23:56
  • 5
    It depends on what you mean by "closed form." For instance, to one person a theta function may be "closed" but to another it's just an infinite product, power series, or complex integral. Come to think of it, that's precisely what a Gamma function is :-). – whuber Jan 06 '15 at 00:00
  • @whuber Good point! By "the amount of bias in this particular estimator", I take it you mean that the bias in $s$ (rather than the estimator listed in the question, which has zero bias) is a function of $n$ (and also in $\sigma$, but fortunately in such a way that we can easily rearrange to find an unbiased estimator)? – Silverfish Jan 06 '15 at 00:13
  • Are you after a characterization, or a partial list of common cases? – Glen_b Jan 06 '15 at 00:55
  • @Glen_b Obviously a characterisation would be particularly interesting, but asking for one would be rather ambitious (unless the general result is a fairly well-known one that I just haven't seen before). If there are a couple of well-known examples, that would be illuminating enough even if they don't suggest a general method. I'm afraid the question is purposefully vague because I was unsure what results are available, so I will happily take whatever gets given! (The vagueness about whether other parameters should be treated as known or unknown is also deliberate.) – Silverfish Jan 06 '15 at 01:07
  • 1
    @whuber: There should be a similar formula for any location-scale family, with the caveat you pointed out that the function of $n$ may be an intractable integral. – Xi'an Jan 06 '15 at 06:15

2 Answers

10

A probably well-known case, but a case nevertheless.
Consider a continuous uniform distribution $U(0,\theta)$. Given an i.i.d. sample, the maximum order statistic, $X_{(n)}$, has expected value

$$E(X_{(n)}) = \frac {n}{n+1}\theta $$

The standard deviation of the distribution is

$$\sigma = \frac {\theta}{2\sqrt 3}$$

So the estimator $$\hat \sigma = \frac 1{2\sqrt 3}\frac {n+1}{n}X_{(n)}$$

is evidently unbiased for $\sigma$.

This generalizes to the case where the lower bound of the distribution is also unknown, since we can construct an unbiased estimator of the range, and the standard deviation is again a linear function of the range (as it essentially is above too).

This exemplifies @whuber's comment that "the amount of bias is a function of $n$ alone" (plus possibly any known constants), so it can be deterministically corrected. And that is the case here.
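A short simulation (my own sketch, with hypothetical helper names) confirms the unbiasedness of this order-statistic estimator:

```python
import random

def sigma_hat_uniform(xs):
    """Unbiased estimator of sigma for U(0, theta), built from the
    sample maximum: (1 / (2*sqrt(3))) * ((n+1)/n) * X_(n)."""
    n = len(xs)
    return (n + 1) / n * max(xs) / (2 * 3 ** 0.5)

random.seed(1)
theta, n, reps = 6.0, 10, 200_000
true_sigma = theta / (2 * 3 ** 0.5)  # sd of U(0, 6), about 1.732

est = [sigma_hat_uniform([random.uniform(0.0, theta) for _ in range(n)])
       for _ in range(reps)]
mean_est = sum(est) / reps
print(mean_est)  # should land close to true_sigma
```

The same simulation, extended to record squared errors, is an easy way to check the MSE comparison raised in the comments below.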

Alecos Papadopoulos
  • 52,923
  • 5
  • 131
  • 241
  • 4
    Now the hard part: when in the world are we interested in the standard deviation of a uniform distribution? (+1) – shadowtalker Jan 06 '15 at 16:40
  • 1
    @ssdecontrol That's an excellent question! -please proceed to the next one... – Alecos Papadopoulos Jan 06 '15 at 17:10
  • 2
    One thing I love about this answer is how poor the estimator is. It's quite common to see a question which boils down to "why do we use $\hat{\theta}$ as an estimator even though it's biased?" Some students need convincing that unbiasedness is not the be-all and end-all, and a poor unbiased estimator is one way to show them. – Silverfish Jan 09 '15 at 17:14
  • 1
    @Silverfish Poor in what way? Some quick simulations show this to have lower MSE than the usual standard deviation (which surprised me). – Dave Sep 27 '19 at 16:47
  • @Dave Interesting! I had jumped to the conclusion it would be poor since it only looked at the maximum order statistic, but I too stand surprised! Shows the value of doing some simulation... – Silverfish Nov 14 '19 at 19:03
10

Although this is not directly connected to the question, there is a 1968 paper by Peter Bickel and Erich Lehmann that states that, for a convex family of distributions $F$, there exists an unbiased estimator of a functional $q(F)$ (for a sample size $n$ large enough) if and only if $q(\alpha F+(1-\alpha)G)$ is a polynomial in $0\le \alpha\le 1$. This theorem does not apply to the problem here because the collection of Gaussian distributions is not convex (a mixture of Gaussians is not a Gaussian).

An extension of the result in the question is that any power $\sigma^\alpha$ of the standard deviation can be unbiasedly estimated, provided there are enough observations when $\alpha<0$. This follows from the result $$\frac{1}{\sigma^2} \sum_{i=1}^n(x_i-\bar{x})^2 \sim \chi^{2}_{n-1}\,,$$ which shows that $\sigma$ is the (unique) scale parameter of $\sum_{i=1}^n(x_i-\bar{x})^2$.

This normal setting can then be extended to any location-scale family $$X_1,\ldots,X_n\stackrel{\text{iid}}{\sim} \tau^{-1}f(\tau^{-1}\{x-\mu\})$$ with a finite variance $\sigma^2$. Indeed,

  1. the variance $$\text{var}_{\mu,\tau}(X)=\mathbb{E}_{\mu,\tau}[(X-\mu)^2]=\tau^2\mathbb{E}_{0,1}[X^2]$$ is a function of $\tau$ alone;
  2. the sum of squares \begin{align*}\mathbb{E}_{\mu,\tau}\left[\sum_{i=1}^n(X_i-\bar{X})^2\right]&=\tau^2\mathbb{E}_{\mu,\tau}\left[\sum_{i=1}^n\tau^{-2}(X_i-\mu-\bar{X}+\mu)^2\right]\\ &=\tau^2\mathbb{E}_{0,1}\left[\sum_{i=1}^n(X_i-\bar{X})^2\right]\end{align*} has an expectation of the form $\tau^2\psi(n)$;
  3. and similarly for any power $$\mathbb{E}_{\mu,\tau}\left[\left\{\sum_{i=1}^n(X_i-\bar{X})^2\right\}^\alpha\right]=\tau^{2\alpha}\mathbb{E}_{0,1}\left[\left\{\sum_{i=1}^n(X_i-\bar{X})^2\right\}^\alpha\right]$$ whenever the expectation is finite.
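The location-scale argument above can be illustrated numerically. The sketch below (my own construction, with made-up helper names) uses the Laplace family as the location-scale family $f$: it first estimates the constant $c_n=\mathbb{E}_{0,1}[\sqrt{\sum_i(X_i-\bar X)^2}]$ by Monte Carlo, since for most $f$ this constant is an intractable integral, and then checks that $s_0\sqrt{\sum_i(X_i-\bar X)^2}/c_n$ (with $s_0$ the sd of the standard member) is approximately unbiased for $\sigma=\tau s_0$. The correction is only as exact as the Monte Carlo estimate of $c_n$.

```python
import math
import random

random.seed(2)

def laplace(mu=0.0, tau=1.0):
    # Standard Laplace draw: exponential magnitude with a random sign,
    # then shifted by mu and scaled by tau.
    return mu + tau * random.expovariate(1.0) * random.choice([-1.0, 1.0])

def root_ss(xs):
    # sqrt of the centered sum of squares
    xbar = sum(xs) / len(xs)
    return math.sqrt(sum((x - xbar) ** 2 for x in xs))

n, reps = 8, 200_000

# Step 1: calibrate c_n = E_{0,1}[sqrt(SS)] by Monte Carlo (mu=0, tau=1).
c_n = sum(root_ss([laplace() for _ in range(n)]) for _ in range(reps)) / reps

# Step 2: the bias-corrected estimator s0 * sqrt(SS) / c_n, where
# s0 = sqrt(2) is the sd of the standard Laplace distribution.
s0 = math.sqrt(2.0)
mu, tau = 5.0, 3.0
true_sigma = tau * s0  # about 4.243

est = [s0 * root_ss([laplace(mu, tau) for _ in range(n)]) / c_n
       for _ in range(reps)]
mean_est = sum(est) / reps
print(mean_est)  # should land close to true_sigma
```

The same two-step recipe (calibrate the constant under the standard member, then rescale) works for any location-scale family with a finite variance, which is exactly the content of points 1–3 above.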
Xi'an
  • 90,397
  • 9
  • 157
  • 575