6

I have always read that every maximum likelihood estimator has to be a function of any sufficient statistic. The idea is that, if we are dealing with a random variable $X$ with mass or density function $f(x\mid\theta)$, and $T$ is a sufficient statistic for $\theta$, then by the factorization theorem $f(\vec{x}\mid\theta)=g(T(\vec{x}),\theta)h(\vec{x})$, so maximizing $f(\vec{x}\mid\theta)$ over $\theta$ amounts to maximizing $g(T(\vec{x}),\theta)$ over $\theta$; therefore every maximum likelihood estimator for $\theta$ must be a function of $T(\vec{x})$.

However, I have what seems to be a counterexample to this result:

Let $X\sim\text{Unif}(\theta-1/2,\theta+1/2)$. The likelihood function is $L(\theta\mid\vec{x})=1_{[x_{(n)}-1/2,\,x_{(1)}+1/2]}(\theta)$, where $x_{(1)}$ and $x_{(n)}$ are, respectively, the minimum and the maximum of our sample $\vec{x}$ of size $n$. Then any $\hat{\theta}$ with $x_{(n)}-1/2\leq\hat{\theta}\leq x_{(1)}+1/2$ is a maximum likelihood estimator. Also, note that $(X_{(1)},X_{(n)})$ is a sufficient statistic. Now let $$\hat{\theta}=x_{(n)}-1/2+\frac{|x_j|}{1+|x_j|}(x_{(1)}-x_{(n)}+1),$$ where $x_j\neq x_{(1)}$ and $x_j\neq x_{(n)}$. This $\hat{\theta}$ is a maximum likelihood estimator for $\theta$, but it is not a function of $(x_{(1)},x_{(n)})$.
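A minimal numerical sketch of this construction (the variable names, the random seed, and the choice $x_j = x_{(2)}$ are illustrative, not part of the argument): it checks that this $\hat{\theta}$ attains the maximum of the likelihood while depending on an observation other than the minimum and the maximum.

```python
import numpy as np

rng = np.random.default_rng(0)
theta, n = 3.0, 5
x = rng.uniform(theta - 0.5, theta + 0.5, size=n)
x_min, x_max = x.min(), x.max()

def likelihood(t):
    # L(t | x) is the indicator of [x_(n) - 1/2, x_(1) + 1/2]
    return float(x_max - 0.5 <= t <= x_min + 0.5)

x_j = np.sort(x)[1]  # an observation that is neither the min nor the max
theta_hat = x_max - 0.5 + abs(x_j) / (1 + abs(x_j)) * (x_min - x_max + 1)

print(likelihood(theta_hat))                     # 1.0: theta_hat maximizes L
print(x_max - 0.5 <= theta_hat <= x_min + 0.5)   # True: it lies in the MLE interval
```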

What is wrong?

edited by Michael Hardy · asked by user39756
  • I'm sorry, but why is this not a function of $(x_{(1)},x_{(n)})$? Your mathematical expression clearly says it is a function! – Landon Carter Dec 28 '16 at 18:07
  • It seems that your example deals with the uniqueness of the MLE and not its dependence on the sufficient statistic. – Michael R. Chernick Dec 28 '16 at 18:14
  • @LandonCarter $\hat{\theta}$ depends on $x_j$, and $x_j$ does not depend on $x_{(1)}$, $x_{(n)}$, so $\hat{\theta}$ cannot be a function of $(x_{(1)},x_{(n)})$. – user39756 Dec 28 '16 at 19:09
  • @MichaelChernick I don't understand. In which part of the proof from the first paragraph is it used that the maximum likelihood estimator is unique? – user39756 Dec 28 '16 at 19:15
  • In English, "the" means one; "a" means possibly more than one. This is technically not in the "proof" but is in the problem statement (proposition). – Michael R. Chernick Dec 28 '16 at 20:00
  • In the "proof" you say that any theta in the interval is an MLE. If true then there is a continuum of MLEs. – Michael R. Chernick Dec 28 '16 at 20:03
  • @MichaelChernick I edited the question, deleting the 'the'. In my example there are a lot of MLEs, but why does this contradict the proof that I wrote in the first paragraph? In the first paragraph I am not assuming uniqueness of the MLE. – user39756 Dec 28 '16 at 20:06
  • What Landon Carter is saying is that x_(1) and x_(n) are sufficient for theta. You appear to be saying that it is the entire interval between them. I think he is right. – Michael R. Chernick Dec 28 '16 at 20:07
  • @MichaelChernick When I write $(x_{(1)},x_{(n)})$ I mean a point, not an interval. The estimator $\hat{\theta}$ cannot be written as $F(x_{(1)},x_{(n)})$, because $x_j$ does not depend on $x_{(1)}$, $x_{(n)}$. But $\hat{\theta}$ is an MLE. This is the contradiction I am asking about. – user39756 Dec 28 '16 at 20:09
  • Look at it this way. Given x_(1) and x_(n) the other x_i s do not add any information about theta. That is another way to look at sufficiency. The other x_i are ancillary. – Michael R. Chernick Dec 28 '16 at 20:16
  • @MichaelChernick Basically, my doubt is the following: it is usually proved that if $\hat{\theta}$ is an MLE and $T$ is sufficient, then $\hat{\theta}=F(T)$. But the $\hat{\theta}$ of my example is an MLE and cannot be written as $F((x_{(1)},x_{(n)}))$. I would like an answer detecting a mathematical error in the proof of the first paragraph or in the example. – user39756 Dec 28 '16 at 20:27
  • I remember that there are problems with the likelihood function when the parameter defines an endpoint of the distribution. If I recall correctly, the MLE for n observations taken on the interval [0, theta] is x_(n). In your case theta defines both boundaries. I have not pinpointed an error in your proof. But I do not know for sure that the likelihood function is correct, and if it is correct, what is the result of differentiating the likelihood or log likelihood wrt theta? Are there regularity conditions that must be met? – Michael R. Chernick Dec 28 '16 at 21:12
  • I have been searching for literature on this and finding it hard to see posts on this site that answer the question. Wikipedia is helpful but doesn't deal with examples like this. – Michael R. Chernick Dec 28 '16 at 21:14
  • I would not be inclined to answer this question unless some authoritative source of the incorrect assertions on which it relies can be cited. In particular, the introductory claim "every maximum likelihood estimator has to be a function of any sufficient statistic" is not true. What is true is that every ML estimator, **if unique**, must be a function of the sufficient statistics. This is an immediate consequence of the definitions of the MLE and sufficient statistic. (See Kendall & Stuart 5th Ed. Vol II section 18.4.) – whuber Dec 29 '16 at 15:50
  • @whuber And in the proof of the first paragraph, where is the fact that the MLE is unique used? – user39756 Dec 29 '16 at 17:27
  • You erred at the point you wrote "therefore". Your conclusion does not follow unless the maximum value depends uniquely on $T$, for otherwise you are free to select $\hat\theta$ in arbitrary ways--exactly as you do in your example. – whuber Dec 29 '16 at 19:02
  • I fail to see how your example is not a function of $x_{(1)}$ and $x_{(n)}$; they are right there in the function definition, and in fact serve to define the endpoints of the line segment in which all the values are maximum likelihood estimates. All your use of $|x_j|$ etc. serves to do is pick one of those values. After all, you are not claiming that all MLEs have to be functions solely of the sufficient statistics. – jbowman Apr 28 '17 at 22:51

3 Answers

5

Nothing is wrong with what you did; the problem lies in the statement that every maximum likelihood estimator has to be a function of any sufficient statistic, which is false as stated. A more correct form of the assertion is:

If $T$ is a sufficient statistic for $\theta$ and a unique MLE $\hat{\theta}$ of $\theta$ exists, then $\hat{\theta}$ must be a function of $T$. If any MLE exists, then an MLE $\hat{\theta}$ can be chosen to be a function of $T$.

This quote is from "Maximum Likelihood and Sufficient Statistics" by D. S. Moore, published in The American Mathematical Monthly (you can find it on JSTOR). There you can also find an example similar to yours and more discussion of your question.
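To see concretely why non-uniqueness breaks the original claim, here is a small sketch (the sample values are hypothetical, chosen by me, not taken from Moore's paper): two samples with the same $(x_{(1)},x_{(n)})$ but a different interior point give different values of your $\hat{\theta}$, so no map $F$ with $\hat{\theta}=F(x_{(1)},x_{(n)})$ can exist, even though each value is an MLE.

```python
import numpy as np

def theta_hat(x):
    x = np.sort(x)
    x_min, x_max, x_j = x[0], x[-1], x[1]   # x_j: an interior order statistic
    return x_max - 0.5 + abs(x_j) / (1 + abs(x_j)) * (x_min - x_max + 1)

a = np.array([2.6, 2.8, 3.4])   # (min, max) = (2.6, 3.4) ...
b = np.array([2.6, 3.1, 3.4])   # ... same (min, max), different interior point
print(theta_hat(a), theta_hat(b))   # two different values, both inside [2.9, 3.1]
```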

– Jesús A.
2

I think that, to preserve the theorem in cases like this, one should define the MLE as the interval of MLEs. That interval is a function of the sufficient statistic.

This page takes a different point of view: For every sufficient statistic, there is at least one MLE that is a function of it. (So if there is only one MLE, then that one is it.)
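A short sketch of both viewpoints (the function names are mine): the set of all MLEs is an interval determined by $(x_{(1)},x_{(n)})$ alone, and selecting a canonical point of it, e.g. the midrange, yields one particular MLE that is a function of the sufficient statistic.

```python
import numpy as np

def mle_interval(x):
    # every theta in [x_(n) - 1/2, x_(1) + 1/2] maximizes the likelihood,
    # and the interval depends on x only through (x_(1), x_(n))
    x_min, x_max = np.min(x), np.max(x)
    return (x_max - 0.5, x_min + 0.5)

def midrange_mle(x):
    # one canonical selection from the interval: its midpoint, (x_(1) + x_(n)) / 2
    lo, hi = mle_interval(x)
    return (lo + hi) / 2

x = np.array([2.6, 2.8, 3.4])
print(mle_interval(x))   # roughly (2.9, 3.1): a function of the sufficient statistic
print(midrange_mle(x))   # 3.0
```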

– Michael Hardy
1

Sufficient statistics only apply to exponential family distributions. Continuous uniform is not an exponential family distribution. See DeGroot, Morris H., Optimal Statistical Decisions, McGraw-Hill Book Company, New York, 1970.

– Haotian Chen