Comparing two estimators

Question

Suppose I have two estimators: one is unbiased and the other is biased, but the biased one has a smaller MSE (mean squared error) than the unbiased one.

Can we figure out the better one in this case? If yes, then which one is the better estimator and why?

Better in what sense? MSE is a typical evaluation metric, so you have your answer if you want to judge based on MSE. — Dave, Oct 23 '20 at 18:52
"Best" has meaning only in terms of the criterion. If you prize unbiasedness over all else, then you will not favor a biased estimator with a smaller MSE. See my answer for an 'established' case of this. — BruceET, Oct 23 '20 at 20:05

BruceET · Answer 1 · 2020-10-25T02:59:18.190

Suppose you have a random sample with $n = 5$ observations from a normal distribution with unknown $\mu$ and $\sigma^2.$ In estimating $\sigma^2,$ the usual sample variance $V_1 = \frac{1}{n-1}\sum_{i=1}^n(X_i-\bar X)^2$ is unbiased for $\sigma^2:$ $E(V_1) = \sigma^2.$

By contrast, the maximum likelihood estimator of $\sigma^2,$ which is
$V_0 = \frac{1}{n}\sum_{i=1}^n(X_i-\bar X)^2,$ is biased, but has smaller MSE. [This is true for any $n,$ but I choose $n=5$ so that the bias of $V_0$ (negligible for large and moderate $n)$ will be unmistakable in my simulation.]

set.seed(2020)
m = 10^6;  n = 5;  mu = 100;  sg = 10
v1 = replicate(m, var(rnorm(n,mu,sg)))  # m unbiased sample variances
v0 = (n-1)*v1/n                         # corresponding MLEs
mean(v0);  mean(v1)
[1] 79.95946  # approx. E(V0) = 80 < 100
[1] 99.94932  # approx. E(V1) = 100
mean((v0-sg^2)^2)
[1] 3606.298  # approx. MSE(V0) = 3600 < MSE(V1)
mean((v1-sg^2)^2)
[1] 5007.307  # approx. MSE(V1) = 5000

For $\sigma^2 = 100,$ we have $E(V_0) = 80, E(V_1) = 100.$ Also, because MSE is variance plus squared bias, $MSE(V_0) = Var(V_0) + [E(V_0) - \sigma^2]^2 = 3200 + 400 = 3600 < MSE(V_1) = Var(V_1) = 5000.$
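
These exact values can be traced to closed forms. The sketch below assumes normal data and the standard result that $(n-1)V_1/\sigma^2 \sim \mathsf{Chisq}(n-1),$ which has variance $2(n-1);$ it also uses $V_0 = \frac{n-1}{n}V_1.$

$$Var(V_1) = \frac{2\sigma^4}{n-1}, \qquad MSE(V_1) = Var(V_1) = \frac{2\sigma^4}{n-1},$$

$$E(V_0) = \frac{n-1}{n}\sigma^2, \qquad Var(V_0) = \frac{2(n-1)}{n^2}\sigma^4,$$

$$MSE(V_0) = \frac{2(n-1)}{n^2}\sigma^4 + \left(-\frac{\sigma^2}{n}\right)^2 = \frac{2n-1}{n^2}\sigma^4.$$

With $n = 5$ and $\sigma^2 = 100,$ these give $MSE(V_1) = 20000/4 = 5000$ and $MSE(V_0) = (9/25)(10000) = 3600.$ The inequality $\frac{2n-1}{n^2} < \frac{2}{n-1}$ is equivalent to $(2n-1)(n-1) = 2n^2 - 3n + 1 < 2n^2,$ which holds for every $n \ge 2.$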

Histograms of v1 and v0:

par(mfrow = c(2,1))
 hdr1="Unbiased Sample Variance"
 hist(v1, br=30, prob=T, xlim=c(0,800), col="skyblue2", main=hdr1)
  abline(v=100, col="red", lty="dotted")
 hdr2="MLE of Population Variance" 
 hist(v0, br=30, prob=T, xlim=c(0,800), col="skyblue2", main=hdr2)
  abline(v=100, col="red", lty="dotted")
par(mfrow = c(1,1))

Note: A few authors have advocated use of the MLE, bias notwithstanding. However, traditional methods of inference for variances using the chi-squared distribution would have to be altered to use the MLE, and many statisticians believe underestimating $\sigma^2$ is a strong argument against the MLE. (Another complication is that dividing by $n+1$ results in an even greater decrease in MSE.)
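
The remark about dividing by $n+1$ can be checked without simulation. The following is a minimal sketch, assuming normal data so that the sum of squared deviations divided by $\sigma^2$ is chi-squared with $n-1$ degrees of freedom. The helper name `mse_div` is hypothetical and not part of the answer above.

```r
# Closed-form MSE of S/d as an estimator of sigma^2, where
# S = sum((X_i - mean(X))^2) and S/sigma^2 ~ chi-squared(n - 1) for normal data.
# mse_div is a hypothetical helper introduced only for this illustration.
mse_div <- function(d, n, sigma2) {
  bias     <- (n - 1) * sigma2 / d - sigma2    # E(S/d) - sigma^2
  variance <- 2 * (n - 1) * sigma2^2 / d^2     # Var(S/d)
  variance + bias^2                            # MSE = variance + bias^2
}

n <- 5; sigma2 <- 100
mse <- mse_div(c(n - 1, n, n + 1), n, sigma2)  # vectorized over the divisor d
names(mse) <- c("n-1", "n", "n+1")
print(mse)
```

For $n = 5$ and $\sigma^2 = 100$ the three values are 5000, 3600, and about 3333.3, so under these assumptions the divisor $n+1$ gives the smallest MSE of the three, at the cost of the largest bias.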
