
Wikipedia gives the following proof of why Bessel's correction is needed for an unbiased sample variance:

\begin{align}
E[\sigma_y^2]
&= E\left[ \frac 1n \sum_{i=1}^n \left(y_i - \frac 1n \sum_{j=1}^n y_j \right)^2 \right] \\
&= \frac 1n \sum_{i=1}^n E\left[ y_i^2 - \frac 2n y_i \sum_{j=1}^n y_j + \frac{1}{n^2} \sum_{j=1}^n y_j \sum_{k=1}^n y_k \right] \\
&= \frac 1n \sum_{i=1}^n \left[ \frac{n-2}{n} E[y_i^2] - \frac 2n \sum_{j \neq i} E[y_i y_j] + \frac{1}{n^2} \sum_{j=1}^n \sum_{k \neq j}^n E[y_j y_k] + \frac{1}{n^2} \sum_{j=1}^n E[y_j^2] \right] \\
&= \frac 1n \sum_{i=1}^n \left[ \frac{n-2}{n} (\sigma^2+\mu^2) - \frac 2n (n-1) \mu^2 + \frac{1}{n^2} n (n-1) \mu^2 + \frac 1n (\sigma^2+\mu^2) \right] \\
&= \frac{n-1}{n} \sigma^2.
\end{align}
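The result itself is easy to check numerically. A minimal simulation sketch, with an arbitrarily chosen normal distribution ($\mu = 3$, $\sigma = 2$) and sample size $n = 5$:

```python
import numpy as np

rng = np.random.default_rng(0)

mu, sigma, n = 3.0, 2.0, 5        # arbitrary true mean, true sd, sample size
trials = 200_000

samples = rng.normal(mu, sigma, size=(trials, n))
# Biased estimator: divide by n (ddof=0), i.e. no Bessel correction.
biased_var = samples.var(axis=1, ddof=0)

print(biased_var.mean())           # ~ (n - 1) / n * sigma**2 = 3.2
print((n - 1) / n * sigma**2)      # 3.2 exactly
```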

The proof is clear so far. The only part that I don't understand is the following identity which is used in the penultimate step:

\begin{align}
\sum_{j \neq i} E[y_i y_j] = (n-1) \mu^2
\end{align}

This would only make sense if $y_i$ and $y_j$ were independent, but they are not, because $i$ has to be unequal to $j$!

To give the simplest possible example: a coin toss that gives $-1$ for heads and $1$ for tails. When you take two independent coin tosses and multiply the results, the expected value is indeed $0^2 = 0$. But if you are only allowed to take the opposite of the other coin toss as your second result (so you have to take $1$ if it was $-1$, and $-1$ if it was $1$), your expectation is clearly unequal to $0$, and therefore $\mu^2$ cannot be right!
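The two cases are easy to simulate (a quick sketch, one million tosses each):

```python
import numpy as np

rng = np.random.default_rng(0)

# Two independent fair tosses, each -1 (heads) or 1 (tails).
x1 = rng.choice([-1, 1], size=1_000_000)
x2 = rng.choice([-1, 1], size=1_000_000)
print((x1 * x2).mean())     # ~ 0, i.e. mu**2

# "Forced opposite": the second toss is defined as -x1.
print((x1 * (-x1)).mean())  # exactly -1, not mu**2
```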

My question
Could you please explain the identity and where my potential fallacy lies?


1 Answer


The independence of $Y_i$ and $Y_j$ (whenever $i \neq j$) is an assumption: you are assuming you are dealing with an i.i.d. sample $Y_1, Y_2, \ldots, Y_n$ from some distribution, and you are trying to estimate that distribution's variance.

When the Ys are i.i.d., independence gives $\mathbb{E}\left[Y_1\,Y_2\right]=\mathbb{E}\left[Y_1\right]\mathbb{E}\left[Y_2\right]=\mu^2$, while $\mathbb{E}\left[Y_1\,Y_1\right]=\mathbb{E}\left[Y_1^2\right]=\mu^2 + \mathbb{V}\left[Y\right]$. That's why the $i \neq j$ condition matters for the value of $\mathbb{E}\left[Y_i\,Y_j\right]$: summing over the $n-1$ indices $j \neq i$ then yields exactly $(n-1)\mu^2$.
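To see the distinction numerically, here is a minimal sketch assuming an arbitrary normal distribution with $\mu = 3$ and $\sigma = 2$:

```python
import numpy as np

rng = np.random.default_rng(42)

mu, sigma = 3.0, 2.0
# One million i.i.d. pairs (Y1, Y2).
y = rng.normal(mu, sigma, size=(1_000_000, 2))

print((y[:, 0] * y[:, 1]).mean())  # i != j: ~ mu**2 = 9
print((y[:, 0] ** 2).mean())       # i == j: ~ mu**2 + sigma**2 = 13
```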

The example you give at the bottom of your question is a poor analogy: taking "the opposite coin toss" would indeed induce correlation between $X_1$ (the first coin toss) and $X_2$ (the second coin toss, if we defined $X_2 \;|\; X_1$ to be "the opposite" of $X_1$). But that is not at all the meaning of $i \neq j$ in the context of the Ys.
