Wikipedia gives the following proof of why Bessel's correction is needed for an unbiased sample variance:
\begin{align}
E[\sigma_y^2] & = E\left[ \frac 1n \sum_{i=1}^n \left(y_i - \frac 1n \sum_{j=1}^n y_j \right)^2 \right] \\
& = \frac 1n \sum_{i=1}^n E\left[ y_i^2 - \frac 2n y_i \sum_{j=1}^n y_j + \frac{1}{n^2} \sum_{j=1}^n y_j \sum_{k=1}^n y_k \right] \\
& = \frac 1n \sum_{i=1}^n \left[ \frac{n-2}{n} E[y_i^2] - \frac 2n \sum_{j \neq i} E[y_i y_j] + \frac{1}{n^2} \sum_{j=1}^n \sum_{k \neq j} E[y_j y_k] + \frac{1}{n^2} \sum_{j=1}^n E[y_j^2] \right] \\
& = \frac 1n \sum_{i=1}^n \left[ \frac{n-2}{n} (\sigma^2+\mu^2) - \frac 2n (n-1) \mu^2 + \frac{1}{n^2} n (n-1) \mu^2 + \frac 1n (\sigma^2+\mu^2) \right] \\
& = \frac{n-1}{n} \sigma^2.
\end{align}

The proof is clear so far.
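As a sanity check, the final result $E[\sigma_y^2] = \frac{n-1}{n}\sigma^2$ is easy to verify by simulation; here is a minimal NumPy sketch (the normal population with $\mu = 2$, $\sigma = 3$ is an arbitrary choice of mine, not part of the Wikipedia proof):

```python
import numpy as np

rng = np.random.default_rng(0)
n, trials = 5, 200_000
mu, sigma = 2.0, 3.0  # population mean and standard deviation (arbitrary choices)

# Draw many i.i.d. samples of size n and compute the *biased* sample
# variance (divide by n, i.e. ddof=0) for each one.
samples = rng.normal(mu, sigma, size=(trials, n))
biased_var = samples.var(axis=1, ddof=0)

print(biased_var.mean())        # ~7.2, i.e. (n-1)/n * sigma^2
print((n - 1) / n * sigma**2)   # 7.2
```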
The only part I don't understand is the following identity, which is used in the penultimate step:

\begin{align}
\sum_{j \neq i} E[y_i y_j] = (n-1) \mu^2
\end{align}

This would only make sense if $y_i$ and $y_j$ were independent, but they are not, because $i$ has to be unequal to $j$!
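Oddly, when I estimate the left-hand side by Monte Carlo for an i.i.d. sample (again a sketch of my own, with an arbitrary normal population and $i$ fixed at $0$), it does come out to $(n-1)\mu^2$, which only deepens my confusion:

```python
import numpy as np

rng = np.random.default_rng(1)
n, trials, mu = 4, 500_000, 2.0
i = 0  # fix one index; by symmetry the choice of i should not matter

samples = rng.normal(mu, 1.0, size=(trials, n))

# Monte Carlo estimate of sum_{j != i} E[y_i y_j]:
# form y_i * y_j for every j, then drop the j = i term.
cross = samples[:, i, None] * samples
lhs = cross.sum(axis=1) - samples[:, i] ** 2

print(lhs.mean())       # ~12, i.e. (n-1) * mu^2
print((n - 1) * mu**2)  # 12
```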
To give the simplest possible example: a coin toss that gives $-1$ for heads and $1$ for tails. When you take two independent coin tosses and multiply the results, the expected value is indeed $\mu^2 = 0^2 = 0$. But if you are only allowed to take the opposite of the other coin toss as your result (so you have to take $1$ if the other is $-1$, and $-1$ if it is $1$), your expectation becomes clearly unequal to $0$, and therefore $\mu^2$ cannot be right!
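Both scenarios of that toy example in code (my own sketch):

```python
import numpy as np

rng = np.random.default_rng(2)
trials = 500_000

# Two genuinely independent tosses, each valued -1 (heads) or +1 (tails).
a = rng.choice([-1, 1], size=trials)
b = rng.choice([-1, 1], size=trials)
print((a * b).mean())   # ~0, matching mu^2 = 0

# The "forced opposite" scenario: the second value is defined as -a.
print((a * -a).mean())  # exactly -1.0
```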
My question
Could you please explain the identity and point out where my potential fallacy lies?