In a simulation study, is there any difference between

$\bullet$ estimating the variance $\sigma^2$, $1000$ times and taking its average, and

$\bullet$ estimating the standard deviation $\sigma$, $1000$ times and taking its average?

Can I do either of these? Is there a reason to prefer one over the other?

Xi'an
user81411
  • Clearly there are *some* differences because the variance and the standard deviation are not the same. Can you be more specific about what you're after? – Glen_b Feb 19 '17 at 10:08
  • We prefer the variance because the formula for variance is unbiased for any underlying distribution. You may find the answers to your question on this page https://stats.stackexchange.com/questions/249688/why-are-we-using-a-biased-and-misleading-standard-deviation-formula-for-sigma – Hugh Feb 19 '17 at 10:10
  • @hugh are you sure unbiasedness should be the only criterion? – Glen_b Feb 19 '17 at 10:11
  • @Glen_b In this link http://bmcmedresmethodol.biomedcentral.com/articles/10.1186/1471-2288-7-34 (Table 1), I do not understand why the authors estimated $\sigma_0$, $\sigma_1$ instead of $\sigma_0^2$, $\sigma_1^2$. – user81411 Feb 19 '17 at 10:20
  • Also, in http://www.joophox.net/publist/methodology05.pdf the authors estimated $\sigma$. – user81411 Feb 19 '17 at 10:22
  • You would take the root of the average of the variances and want to know if you should rather simulate the standard deviation directly, right? – Horst Grünbusch Feb 25 '17 at 15:59
  • @HorstGrünbusch No; rather, I would take the square root of each estimated variance and average all the estimated standard deviations (i.e., standard errors). – user81411 Feb 26 '17 at 02:23

1 Answer

I find this question of interest because it highlights the artificial nature of seeking unbiasedness above everything else. A few points:

  • the variance $\sigma^2$ allows for an unbiased estimator $\hat\sigma_n^2$, while the square root of that estimator, $\hat\sigma_n$, is biased downward as an estimator of $\sigma$ [by Jensen's inequality];

  • there is no generic unbiased estimator of $\sigma$ [generic meaning across all distributions];

  • for a scale or location-scale family of distributions, since $\sigma$ is a scale parameter, the expectation $\mathbb{E}^P[\hat\sigma_n]$ can be written as $$\mathbb{E}^P[\hat\sigma_n]=c(P,n)\,\sigma$$ where $n$ is the sample size and $P$ is the family of distributions. Hence the bias can be corrected family-wise, by using $\hat\sigma_n/c(P,n)$ in place of $\hat\sigma_n$.
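
The points above can be checked with a small simulation. What follows is only a minimal sketch under stated assumptions, not part of the original answer: the data are assumed Normal, with $\sigma=2$, $n=10$ and $1000$ replications chosen arbitrarily, and the names `c4` and `simulate` are hypothetical. For the Normal family the constant is known in closed form, $c(P,n)=\sqrt{2/(n-1)}\,\Gamma(n/2)/\Gamma((n-1)/2)$, usually written $c_4(n)$.

```python
import math
import random
import statistics


def c4(n):
    """Normal-family constant c(P, n), so that E[s_n] = c4(n) * sigma.

    Assumes i.i.d. Normal data and s_n computed with divisor n - 1.
    """
    log_ratio = math.lgamma(n / 2) - math.lgamma((n - 1) / 2)
    return math.sqrt(2.0 / (n - 1)) * math.exp(log_ratio)


def simulate(sigma=2.0, n=10, reps=1000, seed=1):
    """Average the sample variances and the sample SDs over `reps` samples."""
    rng = random.Random(seed)
    variances, sds = [], []
    for _ in range(reps):
        sample = [rng.gauss(0.0, sigma) for _ in range(n)]
        v = statistics.variance(sample)  # unbiased estimator, divisor n - 1
        variances.append(v)
        sds.append(math.sqrt(v))         # biased downward for sigma
    mean_var = statistics.fmean(variances)
    mean_sd = statistics.fmean(sds)
    return {
        "mean_var": mean_var,                  # should be close to sigma**2
        "mean_sd": mean_sd,                    # close to c4(n) * sigma
        "mean_sd_corrected": mean_sd / c4(n),  # family-wise bias correction
        "root_mean_var": math.sqrt(mean_var),  # root of the averaged variance
    }


result = simulate()
for name, value in result.items():
    print(f"{name}: {value:.4f}")
```

The average of the standard deviations can never exceed the square root of the average variance (Jensen's inequality again), so the two summaries in the question differ systematically. The gap shrinks as $n$ grows, since $c_4(n)\to 1$.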

Xi'an
  • I guess the real reason one often pools variances (and not standard deviations) is that (with normally distributed data) it leads to cleaner distribution theory (F-dist). – kjetil b halvorsen Feb 25 '17 at 15:40