
I have some small data sets (about 8 to 11 data points each), assumed to follow a Normal distribution. I would like to find the 95% confidence interval of the 0.005 and 0.995 percentiles of each set.

First, the method of moments is used to estimate the Normal distribution parameters, and confidence intervals for the parameters are built from their exact sampling distributions (the sample mean is Normal; the scaled sample variance follows a Chi-square distribution). The CI of each percentile is then found by simulation.

Second, MLE is also used, and the parameters' CIs are built from the asymptotic Normality of the MLE. The CI of each percentile is again found by simulation.
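A minimal sketch of the two simulation routes described above (hypothetical data, not my actual data set; `B` simulation draws, with the 0.995 percentile as the example):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
x = rng.normal(10.0, 2.0, size=10)        # hypothetical small sample, n = 10
n = len(x)
mu_hat, sd_hat = x.mean(), x.std(ddof=0)  # MoM and MLE point estimates coincide
z = stats.norm.ppf(0.995)                 # standard-normal 0.995 quantile
B = 20000                                 # number of simulation draws

# Route 1 (exact small-sample): (n-1)s^2/sigma^2 ~ Chi-square(n-1),
# and the sample mean is Normal given sigma
sigma_draws = np.sqrt((n - 1) * x.var(ddof=1) / rng.chisquare(n - 1, B))
mu_draws = rng.normal(mu_hat, sigma_draws / np.sqrt(n))
q_exact = mu_draws + z * sigma_draws

# Route 2 (asymptotic MLE): mu and sigma each approximately Normal,
# with variances sigma^2/n and sigma^2/(2n) from the Fisher information
mu_a = rng.normal(mu_hat, sd_hat / np.sqrt(n), B)
sd_a = rng.normal(sd_hat, sd_hat / np.sqrt(2 * n), B)
q_asym = mu_a + z * sd_a

ci_exact = np.percentile(q_exact, [2.5, 97.5])
ci_asym = np.percentile(q_asym, [2.5, 97.5])
```

On samples this small, the Chi-square route produces a noticeably wider interval for the upper percentile than the asymptotic route.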

As the figure shows, the MLE CI is much narrower than the one from the moment method. We know that the MLE is efficient, leading to small variance and narrow CIs, and this understanding is consistent with the figure.

But my MLE CI approach rests on an asymptotic assumption, while my number of data points is quite small. Would such a small sample make the MLE's CI incorrect and worse than the moment method's, or is the MLE still more efficient?

Is the MLE CI too narrow to contain the true value with 95% probability when the sample is this small?
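One way to make this coverage question concrete is a small Monte Carlo check (a hypothetical sketch using a delta-method CI for the percentile, not my exact code):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n, z = 10, stats.norm.ppf(0.995)   # sample size and 0.995 quantile factor
true_q = 0.0 + z * 1.0             # true 99.5th percentile of N(0, 1)

hits, reps = 0, 5000
for _ in range(reps):
    x = rng.normal(0.0, 1.0, size=n)
    mu, sd = x.mean(), x.std(ddof=0)           # MLEs of mu and sigma
    q_hat = mu + z * sd
    se = sd * np.sqrt(1 / n + z**2 / (2 * n))  # delta-method standard error
    if q_hat - 1.96 * se <= true_q <= q_hat + 1.96 * se:
        hits += 1
coverage = hits / reps
```

In runs like this with n = 10, the nominal 95% asymptotic interval covers the true percentile noticeably less than 95% of the time, which is the worry.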

[Figure: comparison of the 95% CIs for the percentiles from the two methods]

Shusheng
    1) Why do you think your small data sets follow a Normal distribution? 2) The moment estimator and the MLE are the same for the mean, and are the same for the std. dev. if using $n$ as the divisor in the sample variance calculation. So your results should be the same regardless of which you use. Even if using $n-1$ as the divisor, the difference will be far smaller than your plots indicate. I strongly suspect a programming error or conceptual mistake... could you clarify a little more just what assumptions you made when building the CIs? – jbowman Nov 22 '13 at 14:33
  • @jbowman Thanks for your comment. 1) The Normality assumption comes from other researchers' previous papers on this data set; in my current project I just follow their conclusion. 2) I agree with you that the point estimates of the parameters (mu and sigma) are the same, and they are indeed the same in my work. But the error of the estimation is not the same, which makes the CI widths different. In the figure, the point estimates are the same (with small differences due to rounding in different software); the main difference is the width of the CIs. I just wonder whether the CI from the MLE is wide enough when estimated from such a small data set. – Shusheng Nov 22 '13 at 15:06
  • If the estimate of the parameter's value is the same for two different methods, it must be that the errors of estimation are the same as well. How could it be different? – jbowman Nov 22 '13 at 15:42
  • @jbowman The point estimates of the parameters are the same. ML can estimate the error of the estimate via the Fisher information, while MoM does not seem to have such a property. So for MoM, I find the error of the estimate from the underlying distributional assumption (the data follow a Normal): mu follows a Normal, and the sample variance follows a scaled Chi-square. This is why the CIs of the parameters differ, because of the different errors of estimation. Thanks. – Shusheng Nov 22 '13 at 16:34
  • What you are really doing is comparing the results of two different ways of calculating the CI, not two different ways of estimating the parameters. I want to emphasize: the distributions of the estimators is the same, because the estimators themselves are the same. Any differences in the CI cannot be due to "different error(s) of estimation", since the errors of estimation are always the same. – jbowman Nov 22 '13 at 17:42

2 Answers


I just wanted to chime in with a story. At the last Joint Statistical Meetings, I saw Donald Rubin speak after a few presentations at a causal inference session. He started poking fun at the presenters because their methods were based on inverse probability weighting schemes (resembling the Horvitz-Thompson estimator in sampling theory). Anyway, I'll never forget the quote (paraphrasing):

"Horvitz-Thompson is just glorified Method of Moments. We've known that was inferior to Maximum Likelihood since Fisher in the 40s!"

Ben Ogorek
  • Thanks for your answer. I agree with you and Rubin that MLE is better (especially with a large data set). Here I also prefer MLE, but I worry that with a small data set the MLE may no longer be good enough (given its asymptotic properties). Or is it still better than MoM? Thanks. – Shusheng Nov 22 '13 at 17:38
  • Yeah I guess I don't know. Surely method of moments must outperform an MLE in _some_ situation (there are a lot of situations!). Please let me know if you find it. – Ben Ogorek Nov 25 '13 at 00:26
  • @baogorek Off the top of my head I can't think of any cases where MoM beats MLE in small samples, but there could well be some cases. It might make an interesting question here - to ask if there are any known cases. If nobody else asks it, perhaps I will. – Glen_b Dec 22 '13 at 19:53

Percentile estimates will not have a normal distribution, even asymptotically. Since you know your data are normal, why not consider a tolerance interval? It will not contain the 0.5th and 99.5th percentiles per se, but you can set one up to cover 99% of the possible values with X% confidence (adjustable). If your goal is coverage of possible values, this will be sufficient. However, if you actually want the percentiles themselves, then see this paper and this
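For reference, a two-sided normal tolerance factor can be computed with Howe's (1969) approximation. This is a sketch on assumed data, not a packaged routine:

```python
import numpy as np
from scipy import stats

def normal_tolerance_factor(n, coverage=0.99, conf=0.95):
    """Two-sided tolerance factor k via Howe's (1969) approximation:
    the interval mean +/- k*s covers `coverage` of the population
    with confidence `conf`."""
    z = stats.norm.ppf((1 + coverage) / 2)
    chi2 = stats.chi2.ppf(1 - conf, n - 1)  # lower chi-square quantile
    return np.sqrt((n - 1) * (1 + 1 / n) * z**2 / chi2)

# hypothetical sample of n = 10 points
rng = np.random.default_rng(1)
x = rng.normal(50.0, 5.0, size=10)
k = normal_tolerance_factor(len(x))
lo, hi = x.mean() - k * x.std(ddof=1), x.mean() + k * x.std(ddof=1)
```

Note how much larger the factor is than the plain 0.995 normal quantile at n = 10; that inflation is exactly the small-sample uncertainty the asymptotic MLE interval ignores.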

  • Thank you for your answer. I did not use a Normal distribution for the percentile, as it is indeed not Normal. The Normal distribution is used for the parameters' distributions, and the percentile is then estimated by simulation. – Shusheng Nov 22 '13 at 14:19
  • Ok, thanks for the clarification. Then consider the above three links. They should help you decide. In general, method of moments is not as efficient as MLE, but there are drawbacks to both, see: http://en.wikipedia.org/wiki/Method_of_moments_(statistics) –  Nov 22 '13 at 14:27
  • Thanks. I agree with you that MoM is not as efficient as MLE. But MLE relies on asymptotic properties that need a large number of data points. I just wonder: in the small-sample case, is MLE still more efficient than MoM? And is the CI width from the MLE wide enough for my small data set? Thanks. – Shusheng Nov 22 '13 at 16:55
  • @Shusheng I re-read your post in case I was missing something. I would recommend you use the MLE, but not the normal approximation for its CI. You should use a parametric bootstrap for the CI by fitting a normal to the data (using the estimates of the mean and variance) and then calculating the upper and lower percentiles of the estimator's bootstrap distribution. –  Nov 23 '13 at 00:33
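The parametric bootstrap suggested in the last comment can be sketched as follows (hypothetical sample; the 99.5th percentile as the target):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
x = rng.normal(10.0, 2.0, size=10)        # hypothetical small sample
mu_hat, sd_hat = x.mean(), x.std(ddof=0)  # MLEs of mu and sigma
n, B = len(x), 10000
z = stats.norm.ppf(0.995)

# Draw synthetic samples from the fitted normal, refit, and record
# the refitted 99.5th percentile each time
boot_q = np.empty(B)
for b in range(B):
    xb = rng.normal(mu_hat, sd_hat, size=n)
    boot_q[b] = xb.mean() + z * xb.std(ddof=0)

# Percentile-method 95% CI from the bootstrap distribution
ci = np.percentile(boot_q, [2.5, 97.5])
```

Because each synthetic sample has the same size n as the real one, the resulting interval reflects small-sample variability directly rather than relying on the asymptotic normal approximation.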