
I'm trying to understand the theory of estimators. As I understand it now, if you have an r.v. $X$ and take $n$ i.i.d. samples then an estimator for $E[X^{2}]$ would be $\overline{X^{2}}$ since $E[\overline{X^{2}}] = E[X^{2}]$ (probably only true for some kind of "nice" r.v.).

However, the same kind of nice result doesn't occur when trying to estimate $E[X]^{2}$. That is to say, the statistic $\overline{X}^{2}$ is not an unbiased estimator of it, and I'm not sure which function is. I have the identity $V[\overline{X}] = E[\overline{X}^{2}]-E[\overline{X}]^{2}$, and since $E[\overline{X}] = E[X]$, this gives $E[X]^{2} = E[\overline{X}^{2}]-V[\overline{X}]$. This seems relevant, but I'm not sure what to conclude from it.
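A quick simulation sketch of that identity (just for illustration, using numpy with an arbitrary Exponential(1) example, so $E[X]^2 = 1$ and $V[\overline{X}] = 1/n$):

```python
import numpy as np

rng = np.random.default_rng(0)
n, reps = 20, 200_000

# Exponential(1) chosen arbitrarily: E[X] = 1, Var[X] = 1, so E[X]^2 = 1 and V[Xbar] = 1/20.
xbar = rng.exponential(1.0, size=(reps, n)).mean(axis=1)   # many independent sample means
print(np.mean(xbar**2))                  # ~ E[Xbar^2] = E[X]^2 + V[Xbar] = 1.05
print(np.mean(xbar**2) - np.var(xbar))   # ~ E[X]^2 = 1, after subtracting V[Xbar]
```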

Sycorax
Addem
  • Actually the natural estimator for $E \left[ X^2 \right]$ is $\frac{1}{n} \sum_{i=1}^n X_i ^2$. Can you show that it is unbiased for the case of iid samples? – JohnK Oct 22 '14 at 14:53
  • 1
    @JohnK I believe that what the OP means by "$\overline{X^{2}}$" is precisely $\frac{1}{n} \sum_{i=1}^n X_i ^2$. – whuber Oct 22 '14 at 15:02
  • 4
    Addem, *any* statistic will estimate $E[X]^2$. The right question to ask is *how well will it do*. The answer depends on how you measure the goodness of an estimator. – whuber Oct 22 '14 at 15:04
  • 1
    @whuber thats a great point, good way to step back and look at the bigger picture. – bdeonovic Oct 22 '14 at 16:56

2 Answers


Background: unbiased estimators of products of population moments

If you desire an *unbiased* estimator of a product of population moments, there are three varieties:

  1. Polykays (a generalisation of k-statistics): these are unbiased estimators of products of population cumulants. The term polykay was coined by Tukey, but the concept goes back to Dressel (1940).

  2. Polyaches (a generalisation of h-statistics): these are unbiased estimators of products of population central moments. i.e.

$$E\left[h_{\{r,t,\ldots,v\}}\right] = \mu_r \mu_t \cdots \mu_v$$ ...... where $\mu_r$ denotes the $r^{th}$ central moment of the population.

  3. Polyraws: these are unbiased estimators of products of population raw moments. That is, you wish to find the $\text{polyraw}_{\{r,t,\ldots,v\}}$ such that:

$$E\left[\text{polyraw}_{\{r,t,\ldots,v\}}\right] = \acute{\mu}_r \acute{\mu}_t \cdots \acute{\mu}_v$$

...... where $\acute{\mu }_r$ denotes the $r^{th}$ raw moment of the population.


The Problem

We are given a random sample $(X_1, X_2, \dots, X_n)$ drawn on parent random variable $X$.

We desire an unbiased estimator of $(E[X])^2 = \acute{\mu}_1 \acute{\mu}_1$. This is given by the $\{1,1\}$ polyraw:

$$\text{polyraw}_{\{1,1\}} = \frac{s_1^2 - s_2}{n\,(n-1)}$$

where $s_r = \sum_{i=1}^n X_i^r$ denotes the $r^{th}$ power sum.
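For readers without mathStatica, here is a plain numpy sketch of the same formula (the Gamma(2,1) example is an arbitrary choice: $E[X] = 2$, so $(E[X])^2 = 4$):

```python
import numpy as np

def polyraw_11(x):
    """{1,1} polyraw: (s1^2 - s2) / (n(n-1)), an unbiased estimator of (E[X])^2."""
    x = np.asarray(x, dtype=float)
    n = x.size
    s1, s2 = x.sum(), (x**2).sum()
    return (s1**2 - s2) / (n * (n - 1))

# Monte Carlo check of unbiasedness; Gamma(2,1) is arbitrary: E[X] = 2, so (E[X])^2 = 4.
rng = np.random.default_rng(1)
samples = rng.gamma(shape=2.0, scale=1.0, size=(100_000, 10))
print(np.mean([polyraw_11(row) for row in samples]))   # close to 4 (unbiased)
print(np.mean(samples.mean(axis=1) ** 2))              # Xbar^2: about 4 + Var[X]/n = 4.2 (biased)
```

The last line shows, for comparison, the bias of $\bar{X}^2$ at the same sample size.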


Comparison

Benjamin proposed the estimator $\bar{X}^2 = \left(\frac{s_1}{n}\right)^2$. This is not an unbiased estimator: its expectation, the $1^{st}$ raw moment of $\left(\frac{s_1}{n}\right)^2$, is

$$E\left[\left(\frac{s_1}{n}\right)^2\right] = \frac{\acute{\mu}_2 + (n-1)\,\acute{\mu}_1^2}{n}$$

which is not equal to $\acute{\mu }_1^2$.

Let us check the polyraw solution:

$$E\left[\frac{s_1^2 - s_2}{n\,(n-1)}\right] = \frac{\left(n\,\acute{\mu}_2 + n(n-1)\,\acute{\mu}_1^2\right) - n\,\acute{\mu}_2}{n\,(n-1)} = \acute{\mu}_1^2$$

... which is an unbiased estimator.

Plainly, unbiasedness is not everything, and we could equally calculate, for example, the MSE (mean-squared error) of each estimator using exactly the same tools.

[Update: Just had a quick play with this: in a simple test case of $X \sim N(0,\sigma^2)$, the polyraw unbiased estimator has smaller MSE than Ben's ML estimator, for all sample sizes $n$. That is, at least for the test case of Normality, the polyraw unbiased estimator dominates the maximum likelihood estimator, at all sample sizes. ]
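One way to check this numerically (a rough Monte Carlo sketch in numpy, not the exact mathStatica calculation; it takes $\sigma = 1$):

```python
import numpy as np

# Monte Carlo MSE comparison for X ~ N(0, 1), so the target (E[X])^2 is 0
# and each estimator's MSE is just the mean of its squared values.
rng = np.random.default_rng(2)
for n in (5, 10, 30):
    x = rng.normal(0.0, 1.0, size=(200_000, n))
    s1, s2 = x.sum(axis=1), (x**2).sum(axis=1)
    polyraw = (s1**2 - s2) / (n * (n - 1))   # unbiased {1,1} polyraw
    mle = (s1 / n) ** 2                      # Xbar^2, the ML estimator under normality
    print(n, np.mean(polyraw**2), np.mean(mle**2))
```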

Notes

  • PolyRaw, RawMomentToRaw, etc. are functions in the mathStatica package for Mathematica.

  • I confess to the neologism polyache in our Springer book (2002) (and more recently, to polyraw in the latest edition).

wolfies
  • 1
    Took me a while to guess you meant to pronounce that "polyache" as 'poly-aitch' (poly-"h"). – Glen_b Oct 22 '14 at 20:33
  • 1
    The lower MSE is very interesting (because simple situations where MLE loses to moments even in fairly small samples don't come along all that often). [This question](http://stats.stackexchange.com/questions/80380/examples-where-method-of-moments-can-beat-maximum-likelihood-in-small-samples) seeks examples where method of moments beats MLE in such a manner. – Glen_b Oct 22 '14 at 21:29

By the continuous mapping theorem, $\bar{X}^2 \to \text{E}[X]^2$ in probability (since $\bar{X} \to \text{E}[X]$ by the law of large numbers), so I would say it is a good estimator.

Depending on the distribution of $X$, if $\bar{X}$ is the MLE of $\text{E}[X]$, then $\bar{X}^2$ will be the MLE of $\text{E}[X]^2$ (MLE is invariant to transformation).

If $X_i$ are iid and $\text{Var}[X] = \sigma^2$, then $\text{Var}[\bar{X}] = \text{Var}\left[ \dfrac{1}{n} \sum_{i=1}^n X_i\right] = \dfrac{1}{n^2}\sum_{i=1}^n \text{Var}[X_i] = \dfrac{\sigma^2}{n}$
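In fact, the bias of $\bar{X}^2$ as an estimator of $\text{E}[X]^2$ is exactly this $\text{Var}[\bar{X}] = \dfrac{\sigma^2}{n}$, so it shrinks to zero as $n$ grows. A quick simulation sketch of that (the Uniform(0,2) example is an arbitrary choice, with $E[X]=1$ and $\sigma^2 = 1/3$):

```python
import numpy as np

# The bias of Xbar^2 for (E[X])^2 is exactly Var[Xbar] = sigma^2/n, so it vanishes as n grows.
# Uniform(0, 2) is an arbitrary choice: E[X] = 1 and sigma^2 = 1/3.
rng = np.random.default_rng(3)
for n in (5, 50, 500):
    xbar = rng.uniform(0.0, 2.0, size=(100_000, n)).mean(axis=1)
    print(n, np.mean(xbar**2) - 1.0, (1.0 / 3.0) / n)   # empirical bias vs sigma^2/n
```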

bdeonovic