4

At Mathematics Stack Exchange, user940 provided a general formula to calculate the variance of the sample variance based on the fourth central moment $\mu_4$ and the population variance $\sigma^2$ (1):

$$\text{Var}(S^2)=\frac{\mu_4}{n}-\frac{\sigma^4(n-3)}{n(n-1)}$$

I would be interested in an unbiased estimator for this, without knowing the population parameters $\mu_4$ and $\sigma^2$, but using the fourth and second sample central moment $m_4$ and $m_2$ (or the unbiased sample variance $S^2=\frac{n}{n-1}m_2$) instead.

Hiro
  • 243
  • 1
  • 10

1 Answers1

9

The question is to find an unbiased estimator of:

$$\text{Var}(S^2)=\frac{\mu_4}{n}-\frac{(n-3)}{n(n-1)} {\mu_2^2}$$

... where $\mu_r$ denotes the $r^\text{th}$ central moment of the population. This requires finding unbiased estimators of $\mu_4$ and of $\mu_2^2$.

An unbiased estimator of $\mu_4$

By defn, an unbiased estimator of the $r^\text{th}$ central moment is the $r^\text{th}$ h-statistic: $$\mathbb{E}[h_r] = \mu_r$$ The $4^\text{th}$ h-statistic is given by:

enter image description here where:

i) I am using the HStatistic function from the mathStatica package for Mathematica

ii) $s_r$ denotes the $r^\text{th}$ power sum $$s_r=\sum _{i=1}^n X_i^r$$

Alternative: The OP asked about finding an unbiased solution in terms of sample central moments $m_r=\frac{1}{n} \sum _{i=1}^n \left(X_i-\bar{X}\right)^r$. An unbiased estimator of $\mu_4$ in terms of $m_i$ is:

enter image description here

An unbiased estimator of $\mu_2^2$

An unbiased estimator of a product of central moments (here, $\mu_2 \times \mu_2$)is known as a polyache (play on poly-h). An unbiased estimator of $\mu_2^2$ is given by:

enter image description here

where:

i) I am using the PolyH function from the mathStatica package for Mathematica

ii) For more detail on polyaches, see section 7.2B of Chapter 7 of Rose and Smith, Mathematical Statistics with Mathematica (am one of the authors), a free download of which is available here.

While mathStatica does not have an automated converter to express PolyH in terms of sample central moments $m_i$ (nice idea), doing that conversion yields:

enter image description here


Putting it all together:

An unbiased estimator of $\frac{\mu_4}{n}-\frac{(n-3)}{n(n-1)} {\mu_2^2}$ is thus:

enter image description here

or, more compactly, in terms of sample central moments $m_i$:

........... enter image description here

And as a check, we can run the expectations operator over the above (the $1^\text{st}$ RawMoment of sol), expressing the solution in terms of Central moments of the population:

enter image description here

... and all is good.

wolfies
  • 6,963
  • 1
  • 22
  • 27
  • Great answer! Thanks! Two things: 1. It would be great if the estimator of $\mu_2^2$ could also be expressed in terms of $m_r$. 2. I'm often encountering terms such as $(n-1)(n-2)(n-3)\ldots$ in the denominator when unbiased quantities are involved. For instance, the unbiased estimator of $\mu_4$ that you have presented has $(n-1)(n-2)(n-3)$ in the denominator. Another example is Bessel's correction with $(n-1)$ in the denominator. Can I generally assume that _any_ unbiased estimator of $\mu_r$ is only defined for $n \geq r$? Is there a theorem for that? – Hiro Oct 13 '17 at 06:51
  • 1
    Yes - the solutions assume the sample size $n \ge w$, where $w$ denotes the _weight_ of the expression. For instance, an expression containing $\mu_2^3$ has weight = 6. It also assumes, of course, that the moments of the parent random variable exist up to that weight. I have updated the answer above to provide the unbiased estimator of $\mu_2^2$ in terms of $m_i$ which is indeed neater. – wolfies Oct 13 '17 at 15:53
  • Thanks for the explanation! However, I think that there is an issue with the unbiased estimator of $\mu_2^2$ in terms of $m_i$. I have checked it in Mathematica by calculating the average of multiple estimators and the result is quite off. It would be great if you could take a look at it again! – Hiro Oct 16 '17 at 07:36
  • 1
    @Hiro - checked and seems fine to me. I have further checked by converting the expression in sample central moments back to power sums which can be done with `expr /. Subscript[m, i_] :> SampleCentralToPowerSum[i][[2]] // Simplify` which returns the expression in power sums above. If you can post your (simulation?) code, I will be happy to look at what you are doing. – wolfies Oct 17 '17 at 15:26
  • I have checked my code and noticed that I've made a mistake that introduced the bias. After correcting it, everything works great! :-) Sorry for that and again, thank you very much! – Hiro Oct 18 '17 at 13:41
  • @wolfies Hello! Just one question: I've run this estimator 100.000 times and some realizations seem to be negative. Is that expected? How can that be explained if it's a variance? Here's a screenshot of my code i.stack.imgur.com/5YEp9.png – Ivan Apr 03 '19 at 15:32