
Say we have a univariate Gaussian distribution $p(x)=\mathcal N(x|\mu, \sigma^{2})$. Then suppose we have a data set of observations $\mathbf X = (x_1, x_2, \cdots ,x_N)^{T}$ drawn independently from this distribution.

By performing MLE, which means maximizing $$\ln\prod_{n = 1}^{N} \mathcal N(x_n|\mu, \sigma^{2})$$ with respect to $\mu$ and $\sigma^2$ respectively, I can get the maximum likelihood solution for the variance $\sigma^2$, which is $$\sigma^{2}_{ML} = \frac{1}{N}\sum_{n = 1}^{N}(x_n-\mu_{ML})^{2}$$
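(For context, here is a sketch of how that solution is obtained; the steps below are my own filling-in, not something stated above. Writing out the log-likelihood gives
$$\ln\prod_{n = 1}^{N} \mathcal N(x_n|\mu, \sigma^{2}) = -\frac{N}{2}\ln(2\pi\sigma^{2}) - \frac{1}{2\sigma^{2}}\sum_{n = 1}^{N}(x_n-\mu)^{2},$$
and setting its derivative with respect to $\sigma^{2}$ to zero,
$$-\frac{N}{2\sigma^{2}} + \frac{1}{2\sigma^{4}}\sum_{n = 1}^{N}(x_n-\mu_{ML})^{2} = 0 \;\Rightarrow\; \sigma^{2}_{ML} = \frac{1}{N}\sum_{n = 1}^{N}(x_n-\mu_{ML})^{2}.)$$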

Clearly, in front of the summation, we have a $\frac{1}{N}$. However, I have also read that the solution for $\sigma^2$ is $$\sigma^{2}_{ML} = \frac{1}{N-1}\sum_{n = 1}^{N}(x_n-\mu_{ML})^{2},$$ in which $N$ still denotes the size of the dataset.

Why are the two solutions different? Why is there $N-1$ in the denominator?

meTchaikovsky

1 Answer


While $\frac{1}{N}\sum_{n = 1}^{N}(x_n-\bar x)^{2}$ is the maximum likelihood estimator of $\sigma^2$, it is biased: because $\bar x$ is itself fitted to the same data, the squared deviations are systematically too small, and one can show that $$\mathbb E\left[\frac{1}{N}\sum_{n = 1}^{N}(x_n-\bar x)^{2}\right] = \frac{N-1}{N}\,\sigma^{2} < \sigma^{2}.$$ Dividing by $N-1$ instead of $N$ (Bessel's correction) exactly cancels this factor. So the first formula is the maximum likelihood estimator, while the second one (the usual sample variance) is the bias-corrected estimator; the latter is not a maximum likelihood solution.
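You can see the bias empirically with a quick simulation. The sketch below (variable names and the seed are my own choices) draws many small Gaussian samples and averages the two estimators; `ddof=0` gives the $\frac{1}{N}$ (ML) version and `ddof=1` the $\frac{1}{N-1}$ (Bessel-corrected) version:

```python
import numpy as np

rng = np.random.default_rng(0)
sigma2_true = 4.0   # true variance of the Gaussian
N = 5               # small sample size makes the bias easy to see
trials = 200_000    # number of independent datasets

# Each row is one dataset of N i.i.d. draws from N(0, sigma2_true).
samples = rng.normal(loc=0.0, scale=np.sqrt(sigma2_true), size=(trials, N))

mle = samples.var(axis=1, ddof=0)       # divides by N     (maximum likelihood)
unbiased = samples.var(axis=1, ddof=1)  # divides by N - 1 (Bessel-corrected)

print(mle.mean())       # close to (N-1)/N * sigma2_true = 3.2
print(unbiased.mean())  # close to sigma2_true = 4.0
```

On average the ML estimator comes in low by the factor $\frac{N-1}{N}$, while the corrected estimator centers on the true variance.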

Tim