
I am currently trying to learn the two related concepts of the Rao-Blackwell theorem and the Lehmann-Scheffé theorem.

Assume we have the random sample $X_1, \dots, X_n$ with mean $\mu$ and variance $\sigma^2 < \infty$. We have that $E[S^2] = \sigma^2$, where $S^2 = \sum_{i = 1}^n \dfrac{(X_i - \bar{X})^2}{n - 1}$ and $\bar{X} = \sum_{i = 1}^n \dfrac{X_i}{n}$.

Now assume the $X_i$ are Poisson random variables with parameter $\lambda$. My understanding is that, using Lehmann-Scheffé, we get that $E[S^2 \mid \bar{X}] = \bar{X}$. Then, using the law of total variance, we get that $\text{Var}(S^2) > \text{Var}(\bar{X})$.

Based on what I've read, the two above theorems imply that, if we define, say, a sufficient statistic $T_1(\mathbf{X})$ and a complete sufficient statistic $T_2(\mathbf{X})$ for some parameter $\varphi$, then, under some conditions, we can say that $\text{Var}(T_1(\mathbf{X})) > \text{Var}(T_2(\mathbf{X}))$. However, I'm having trouble understanding this last part. I've read over various notes on the subject, but I'm still not sure I understand what it's saying. Why can we say that $\text{Var}(T_1(\mathbf{X})) > \text{Var}(T_2(\mathbf{X}))$? And what are these 'conditions' that make this inequality valid?
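To make the Poisson claims concrete, here is a small simulation sketch (the values $\lambda = 3$ and $n = 5$ are arbitrary illustration choices). It checks that $S^2$ and $\bar{X}$ are both unbiased for $\lambda$ with $\text{Var}(S^2) > \text{Var}(\bar{X})$, and that averaging $S^2$ within each value of the sufficient statistic $\sum_i X_i$ reproduces $\bar{X}$, consistent with $E[S^2 \mid \bar{X}] = \bar{X}$:

```python
import numpy as np

rng = np.random.default_rng(1)
lam, n, reps = 3.0, 5, 500_000        # arbitrary illustration values

x = rng.poisson(lam, size=(reps, n))
xbar = x.mean(axis=1)
s2 = x.var(axis=1, ddof=1)            # S^2 with the n - 1 denominator

# Both estimators are unbiased for lambda, but S^2 has the larger variance
print("mean of xbar:", xbar.mean(), " var:", xbar.var())
print("mean of  S^2:", s2.mean(), " var:", s2.var())

# Empirical E[S^2 | xbar]: group by the (discrete) sufficient statistic sum(X_i)
# and average S^2 within each group; the result should track xbar = sum / n
total = x.sum(axis=1)
for t in range(10, 21):
    sel = total == t
    print(f"sum = {t:2d}   xbar = {t / n:.2f}   mean of S^2 given this sum = {s2[sel].mean():.3f}")
```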

The Pointer
  • $E[S^2|\bar{X}] = \bar{X}$??? – AdamO Apr 14 '21 at 17:10
  • I can't think of an example of an incomplete and sufficient statistic and another complete and sufficient statistic *for the same probability model*. R-B says if you take an unbiased statistic and condition on the sufficient statistic, you will get a new estimator with lower variance but there may be other unbiased estimators with even lower variance. L-S says if it's complete and sufficient, you get the UMVUE. – AdamO Apr 14 '21 at 17:15
  • I think the Hodges superefficient estimator is a simple counterexample that actually helps understand how ill-formed some of this early variance bound research was: https://en.wikipedia.org/wiki/Hodges%27_estimator – AdamO Apr 14 '21 at 17:21
  • @AdamO My research turned up this https://math.stackexchange.com/q/2881768/356308 – The Pointer Apr 14 '21 at 17:21
  • For a *Poisson* model, the variance = mean. You have a normal model here! – AdamO Apr 14 '21 at 17:23
  • @AdamO Sorry about that. I've been trying to read from multiple documents explaining the same thing, so I got confused when typing out my question. Is that better? – The Pointer Apr 14 '21 at 17:31
  • Please avoid linking to documents without providing the context as it is unrealistic to expect people to read these before addressing the question. – Xi'an Apr 14 '21 at 17:55
  • @Xi'an Sorry about that. I will not link any more documents. Is my question ok? As I said, I don't really understand what it's saying myself, so I hope I managed to get the message across. – The Pointer Apr 14 '21 at 18:01
  • @Xi'an I thought one would distinguish $T$ and $(T,T)$ in that $T$ is *minimally* complete & sufficient, whereas $(T,T)$ would simply be complete, sufficient. Is that not correct? – AdamO Apr 14 '21 at 18:16
  • If $S^2$ and $\bar{X}$ are unbiased estimators for $\lambda$, then could it mean that $T_1(\mathbf{X})$ and $T_2(\mathbf{X})$ are $S^2$ and $\bar{X}$? We can say that, if $S^2$ is an unbiased estimator of $\theta$ and $\bar{X}$ is a sufficient statistic for $\theta$, then, since we know that $E[S^2|\bar{X}] = \bar{X}$, (1) $\bar{X}$ is an unbiased estimator of $\theta$ (that is, $E(\bar{X}) = \theta$) and (2) $\text{Var}(\bar{X}) \le \text{Var}(S^2)$. (1) and (2) are just the Rao-Blackwell theorem, right? – The Pointer Apr 14 '21 at 18:41
  • Your phrase should be *"if we define, say, a statistic $T_1(\mathbf{X})$ and a sufficient (complete) statistic $T_2(\mathbf{X})$ for some parameter $\varphi$ then, under some conditions, we can say that $\text{Var}(T_3(\mathbf{X})) \leq \text{Var}(T_1(\mathbf{X}))$, where $T_3(\mathbf{X})=E[T_1(\mathbf{X})\, | \,T_2(\mathbf{X})]$. "* **1:** This $T_3$ is a different estimator from $T_1$ and $T_2$, but it is related. **2:** $T_1$ need not be sufficient. **3:** $T_2$ doesn't need to be complete (for RB theorem and the inequality can be equality as well). – Sextus Empiricus Apr 22 '21 at 05:50
  • @SextusEmpiricus Oh my goodness, you're right. It seems that *I* completely screwed it up. The problem is that this was all based on memory, so I was trying to recall the idea and then ask. And it seems that, in doing so, I confused myself and wrote the wrong thing! My apologies! Given what I have written, it seems that what Thomas wrote is absolutely correct. Again, I apologise for any confusion. – The Pointer Apr 22 '21 at 05:58
  • Actually, the phrase needs another addition and that is that $T_1$ must be an unbiased statistic. – Sextus Empiricus Apr 22 '21 at 05:59
  • When I read this question first I thought "what is going on here?". AdamO had the same idea, how do you get to consider two sufficient statistics and only one of them complete? Thomas wrote an excellent answer to explain it. – Sextus Empiricus Apr 22 '21 at 06:02
  • @SextusEmpiricus yes, I will leave this question as is, because, despite the fact that this wasn’t the question I wanted to ask, it attracted a valuable answer and is quite educational. If I decide to ask the intended question sometime in the future, then it is best done as a new question. – The Pointer Apr 22 '21 at 06:06

1 Answer


There is a complete sufficient statistic for $\theta$ in a model ${\cal P}_\theta$ if and only if the minimal sufficient statistic is complete (according to Lehmann's "An Interpretation of Completeness and Basu’s Theorem"). This means you can't have distinct $T_1(X)$ and $T_2(X)$ the way you want. As the paper says (first complete paragraph of the second column):

On the other hand, existence of a complete sufficient statistic is equivalent to the completeness of the minimal sufficient statistic, and hence is a property of the model ${\cal P}$.

That is, in any given model, if $T_2$ is complete sufficient and $T_1$ is sufficient, $T_1$ is also complete sufficient.
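For the Poisson model in the question, the completeness of the minimal sufficient statistic can be checked directly; the following display is just the standard argument, spelled out for concreteness. With $T=\sum_i X_i \sim \text{Poisson}(n\lambda)$, suppose $E_\lambda[g(T)] = 0$ for every $\lambda > 0$:
$$
\sum_{t=0}^{\infty} g(t)\,\frac{(n\lambda)^t}{t!}\,e^{-n\lambda} = 0 \quad\text{for all } \lambda > 0 .
$$
The power series $\sum_{t} g(t)(n\lambda)^t/t!$ then vanishes identically, so every coefficient $g(t)/t!$ is zero, i.e. $g \equiv 0$. Hence $T$ (equivalently $\bar{X}$) is complete as well as minimal sufficient.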

The two theorems say

1/ Rao-Blackwell: Conditioning an unbiased estimator on any sufficient statistic gives a new unbiased estimator whose variance is no larger (and is strictly smaller unless the original estimator was already a function of the sufficient statistic). This follows from the law of total variance.

2/ Lehmann-Scheffé: In the special case that the model has a complete sufficient statistic, conditioning on it gives the (essentially unique) uniformly minimum variance unbiased estimator, i.e. a fully efficient estimator within the class of unbiased estimators.

In the Poisson case, the minimal sufficient statistic $\bar X$ is complete, so the Rao-Blackwellised estimator $E[S^2 \mid \bar X]$ and the UMVUE $\bar X$ are identical.
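Spelled out for the Poisson example in the question (this is just the standard two-step argument, written out for concreteness):
$$
\text{Var}(S^2) \;=\; \text{Var}\!\big(E[S^2 \mid \bar X]\big) + E\big[\text{Var}(S^2 \mid \bar X)\big] \;\ge\; \text{Var}\!\big(E[S^2 \mid \bar X]\big) \;=\; \text{Var}(\bar X).
$$
The first equality is the law of total variance and needs only sufficiency of $\bar X$ (the Rao-Blackwell step). The last equality uses completeness: $E[S^2 \mid \bar X]$ and $\bar X$ are both unbiased estimators of $\lambda$ that are functions of the complete statistic $\bar X$, so they must coincide (the Lehmann-Scheffé step). The inequality is strict here because $S^2$ is not a function of $\bar X$ alone.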

There's an interesting example here of a situation where a Rao-Blackwell-type estimator is not fully efficient (not even admissible). The model is $X\sim U[\theta(1-k),\theta(1+k)]$ for known $k$ and unknown $\theta$. The Cramér-Rao bound does not apply, since the range of $X$ depends on $\theta$. A sufficient statistic is the pair of extremes $(X_{(1)}, X_{(n)}) = (\min_i X_i, \max_i X_i)$, and any single observation is an unbiased estimator of $\theta$; however, $E[X_1 \mid X_{(1)}, X_{(n)}]$ is not even the best linear function of the two components of the sufficient statistic.
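Here is a small simulation sketch of that claim (the values $\theta = 10$, $k = 0.5$, $n = 5$ are arbitrary, and the unbiasedness constraint below uses the standard uniform order-statistic means $E[X_{(1)}] = \theta\{1 - k(n-1)/(n+1)\}$ and $E[X_{(n)}] = \theta\{1 + k(n-1)/(n+1)\}$). The choice $a = b = 1/2$ is the midrange, which is exactly $E[X_1 \mid X_{(1)}, X_{(n)}]$; a grid search over the unbiased linear combinations $a X_{(1)} + b X_{(n)}$ finds one with noticeably smaller variance:

```python
import numpy as np

rng = np.random.default_rng(0)
theta, k, n = 10.0, 0.5, 5            # true parameter, known k, sample size
reps = 200_000

# Simulate the model X ~ U[theta(1 - k), theta(1 + k)] and keep the extremes
x = rng.uniform(theta * (1 - k), theta * (1 + k), size=(reps, n))
lo, hi = x.min(axis=1), x.max(axis=1)

# E[X_(1)] = theta * m1 and E[X_(n)] = theta * m2 (uniform order-statistic means)
m1 = 1 - k * (n - 1) / (n + 1)
m2 = 1 + k * (n - 1) / (n + 1)

def mse(a):
    """MSE of a*min + b*max with b chosen so a*m1 + b*m2 = 1 (unbiased for every theta)."""
    b = (1 - a * m1) / m2
    est = a * lo + b * hi
    return np.mean((est - theta) ** 2)

# a = 1/2 gives the midrange, i.e. the Rao-Blackwellised estimator E[X_1 | min, max]
print("midrange (a = 0.5)  MSE:", mse(0.5))

grid = np.linspace(-1.0, 1.5, 251)
a_best = min(grid, key=mse)
print("best linear  a =", round(float(a_best), 3), " MSE:", mse(a_best))
```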

A couple more points to fill in potential gaps:

  1. This leaves open whether there might be an unbiased estimator attaining the Cramér-Rao bound that isn't obtainable using the Lehmann-Scheffé theorem. There isn't (in reasonably nice models): any model where the bound is attained has a score function of the form $$\frac{\partial \ell}{\partial \theta}=I(\theta)(g(x)-\theta)$$ for some $g()$ (where $I()$ is the information), in which case $g(x)$ is both a complete sufficient statistic for $\theta$ and the minimum variance unbiased estimator. (A worked Poisson example is given after this list.)

  2. As @AdamO indicates, none of this translates tidily to asymptotics: there are asymptotically unbiased estimators that beat the asymptotic information bound at a point (Hodges' superefficient estimator) and even on a dense set of measure zero (Le Cam's extension of Hodges' estimator). The best you can do is the Local Asymptotic Minimax theorem, which says you can't beat an 'efficient' estimator uniformly over neighbourhoods of $\theta_0$ with diameter $O(n^{-1/2})$. (A small simulation of the Hodges estimator is given below.)
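To illustrate point 1 with the Poisson model from the question (a standard computation, added here for concreteness): the log-likelihood of $X_1,\dots,X_n \sim \text{Poisson}(\lambda)$ has score
$$
\frac{\partial \ell}{\partial \lambda} = \frac{\sum_i X_i}{\lambda} - n = \frac{n}{\lambda}\,(\bar X - \lambda) = I(\lambda)\,(\bar X - \lambda), \qquad I(\lambda) = \frac{n}{\lambda},
$$
so $g(x) = \bar x$ is exactly the complete sufficient statistic and the UMVUE, and it attains the Cramér-Rao bound: $\text{Var}(\bar X) = \lambda/n = 1/I(\lambda)$.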

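To illustrate point 2, here is a minimal simulation sketch of the Hodges estimator of a $N(\theta, 1)$ mean, using the textbook construction $\hat\theta = \bar X \cdot \mathbf{1}\{|\bar X| \ge n^{-1/4}\}$ (the sample size and evaluation points below are arbitrary choices). At $\theta = 0$ it beats the information bound $n \cdot \text{MSE} = 1$, but it pays for this at nearby values of $\theta$:

```python
import numpy as np

rng = np.random.default_rng(2)
n, reps = 10_000, 50_000

def hodges_n_mse(theta):
    """n * MSE of the Hodges estimator: keep xbar unless |xbar| < n^(-1/4), else use 0."""
    # simulate the sample mean directly: xbar ~ N(theta, 1/n)
    xbar = rng.normal(theta, 1.0 / np.sqrt(n), size=reps)
    est = np.where(np.abs(xbar) >= n ** (-0.25), xbar, 0.0)
    return n * np.mean((est - theta) ** 2)

print("information bound for n * MSE        :", 1.0)
print("Hodges, n * MSE at theta = 0         :", hodges_n_mse(0.0))                  # far below 1
print("Hodges, n * MSE at theta = n^(-1/4)/2:", hodges_n_mse(0.5 * n ** (-0.25)))   # far above 1
```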
Thomas Lumley
  • Thanks again, Thomas. Are you sure about what you said in the first part? Isn't $E[S^2 \mid \bar{X}] = \bar{X}$ the special case of Lehmann-Scheffé? I was told that, if we have two statistics, $T_2(\mathbf{X})$ and $T_1(\mathbf{X})$ are a complete sufficient statistic and statistic, respectively, for some parameter of interest $\theta$, then $\text{Var}(T_1(\mathbf{X})) > \text{Var}(T_2(\mathbf{X}))$. So my question is then as follows: under what conditions does this inequality hold? – The Pointer Apr 19 '21 at 18:49
  • The point is it *never* holds for the same $\theta$. If $T_2$ is complete sufficient and $T_1$ is sufficient, then $T_1$ will also be complete. – Thomas Lumley Apr 19 '21 at 21:51
  • Hmm, that's very interesting. Is there a theorem that says this, or something that I can point to to justify this reasoning ("this never holds for the same $\theta$, because ...")? – The Pointer Apr 19 '21 at 21:53
  • The first link is to a paper by Lehmann that says the minimal sufficient statistic is complete in any model with a complete sufficient statistic – Thomas Lumley Apr 20 '21 at 01:15
  • This is a great answer, thanks! I was very confused as to how to know when a Rao-Blackwell-type estimator is an MVUE; I see now that completeness is the key. Are there any results on when statistics obtained from factorization are complete? – Adrian Keister Jun 21 '21 at 21:57