8

I am totally confused: On the one hand you can read all kinds of explanations of why you have to divide by n-1 to get an unbiased estimator of the (unknown) population variance (degrees of freedom, not defined for sample size 1, etc.) - see e.g. here or here.

On the other hand, when it comes to estimating the variance of a supposedly normal distribution, none of this seems to hold anymore: there it is said that the maximum likelihood estimator of the variance divides only by n - see e.g. here.

Now, can anyone please enlighten me why it is true here but not there? After all, normality is what most models boil down to (not least because of the CLT). So is "division by n" the best choice for estimating the true population variance after all?!?

kjetil b halvorsen
vonjd

3 Answers

6

The MLE is indeed found through division by n. However, MLEs are not guaranteed to be unbiased, so there is no contradiction in the fact that the unbiased estimator (which divides by n-1) is the one commonly used.

In practice, for reasonable sample sizes, it should not make a big difference anyway.
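A quick Monte Carlo simulation makes the bias of the divide-by-n estimator visible (a sketch added for illustration; the true variance, sample size, and trial count below are arbitrary choices, not from the answer):

```python
import random

random.seed(0)
sigma2 = 4.0      # true population variance (chosen for this demo)
n = 5             # small sample size, where the bias is most visible
trials = 200_000

mle_sum = 0.0
unbiased_sum = 0.0
for _ in range(trials):
    xs = [random.gauss(0.0, sigma2 ** 0.5) for _ in range(n)]
    m = sum(xs) / n
    ss = sum((x - m) ** 2 for x in xs)
    mle_sum += ss / n             # MLE: divide by n
    unbiased_sum += ss / (n - 1)  # unbiased: divide by n - 1

print(mle_sum / trials)       # close to (n-1)/n * sigma2 = 3.2: biased low
print(unbiased_sum / trials)  # close to sigma2 = 4.0
```

Averaged over many samples, the divide-by-n estimator systematically underestimates $\sigma^2$ by the factor $(n-1)/n$, which is exactly the bias that dividing by $n-1$ removes.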

Nick Sabbe
    ...where practice is large $n$. –  May 05 '11 at 09:07
  • So does that mean that the unbiased estimator divides by n-1 *also* under normality? – vonjd May 05 '11 at 11:07
  • See e.g. [wikipedia](http://en.wikipedia.org/wiki/Variance#Population_variance_and_sample_variance) for a proof that this is the case regardless of distribution. So, in short (again): yes. – Nick Sabbe May 05 '11 at 11:19
6

The answer to your question is contained within your question.

When choosing an estimator for a parameter, you should ask yourself which properties you would like that estimator to have:

  • Robustness
  • Unbiasedness
  • The distributional properties of an MLE
  • Consistency
  • Asymptotic normality
  • Suitability for the case where you know the population mean, but the variance is unknown

If your estimator is the one that divides by (n-1), then you want an unbiased estimator of the variance. If your estimator is the one that divides by n, then you have the MLE. Of course, when n is large, dividing by either (n-1) or n will give you approximately the same result, and the estimator will be approximately unbiased and have the properties of all MLEs.
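The "approximately the same for large n" point can be checked directly: computed from the same sample, the two estimators differ by exactly the factor n/(n-1), which tends to 1 (an illustrative sketch, not part of the original answer):

```python
import random

random.seed(1)
for n in (5, 50, 5000):
    xs = [random.gauss(0.0, 1.0) for _ in range(n)]
    m = sum(xs) / n
    ss = sum((x - m) ** 2 for x in xs)
    mle, unbiased = ss / n, ss / (n - 1)
    # The ratio is exactly n / (n - 1), regardless of the data.
    print(n, round(unbiased / mle, 4))  # ratios: 1.25, 1.0204, 1.0002
```

At n = 5 the two estimates differ by 25%; at n = 5000 the difference is 0.02%.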

schenectady
1

One is the unbiased estimator.

One is the maximum likelihood estimator.

They are not contradictory; they just serve different objectives.

If you think about an arbitrary distribution, the point that maximizes the likelihood function is not necessarily the mean value.
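For completeness, the standard derivation behind this point, for an i.i.d. normal sample (a sketch added for reference, not part of the original answer): the log-likelihood is

$$\ell(\mu,\sigma^2) = -\frac{n}{2}\log(2\pi\sigma^2) - \frac{1}{2\sigma^2}\sum_{i=1}^n (x_i-\mu)^2,$$

and maximizing over $\sigma^2$ (with $\hat\mu = \bar x$ plugged in) gives

$$\frac{\partial \ell}{\partial \sigma^2} = -\frac{n}{2\sigma^2} + \frac{1}{2\sigma^4}\sum_{i=1}^n (x_i-\bar x)^2 = 0 \quad\Longrightarrow\quad \hat\sigma^2_{\mathrm{MLE}} = \frac{1}{n}\sum_{i=1}^n (x_i-\bar x)^2,$$

whose expectation is $\frac{n-1}{n}\sigma^2$ rather than $\sigma^2$: the likelihood peak does not sit at the unbiased value.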

gdlamp