10

The Rao-Blackwell Theorem states:

Let $\hat{\theta}$ be an estimator of $\theta$ with $\Bbb E (\hat{\theta}^2) < \infty$ for all $\theta$. Suppose that $T$ is sufficient for $\theta$, and let $\theta^* = \Bbb E (\hat{\theta} \mid T)$. Then, for all $\theta$, $$\Bbb E (\theta^* - \theta )^2 \leq \Bbb E (\hat{\theta} - \theta )^2.$$ The inequality is strict unless $\hat{\theta}$ is a function of $T$.
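
For concreteness, here is the kind of textbook example I have in mind (my own illustration, not part of the theorem as stated above): let $X_1,\dots,X_n$ be i.i.d. Bernoulli$(\theta)$, take the crude estimator $\hat{\theta}=X_1$ and the sufficient statistic $T=\sum_{i=1}^n X_i$. Then $$\theta^* = \Bbb E (X_1 \mid T) = \frac{T}{n} = \bar{X}, \qquad \Bbb E (\theta^*-\theta)^2 = \frac{\theta(1-\theta)}{n} \leq \theta(1-\theta) = \Bbb E (\hat{\theta}-\theta)^2.$$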

If I understand this theorem correctly, it says that, if I have a sufficient statistic $T$ for $\theta$, then the conditional expected value of $\hat{\theta}$ given $T$ is the solution to $\min_{\hat{\theta}} \Bbb E (\hat{\theta}-\theta)^2$.

My Questions

  1. Am I correct that $\theta^*$ minimizes $\Bbb E (\hat{\theta}-\theta)^2$?
  2. Why does the Rao-Blackwell Theorem require $\Bbb E(\hat{\theta}^2) < \infty$?
  3. Why is the inequality strict unless $\hat{\theta}$ is a function of $T$?
Stan Shunpike
  • Possible duplicate of [Intuitive and Formulaic Justification for the Rao-Blackwell Theorem](http://stats.stackexchange.com/questions/123820/intuitive-and-formulaic-justification-for-the-rao-blackwell-theorem) – Xi'an Feb 26 '16 at 13:37
  • What is required to find $\min_{\hat{\theta}} \Bbb E (\hat{\theta}-\theta)^2$? – Stan Shunpike Feb 28 '16 at 00:01

2 Answers

7
  1. No, $\theta^*$ is a better estimator than $\hat\theta$, but not necessarily the best (whatever that means!).
  2. If the estimator does not have a finite variance, then its risk (mean squared error) is infinite and there is no guarantee that $\theta^*$ has a finite risk (even though this may happen, as pointed out by Horst Grünbusch in his comments).
  3. Under finite variance for $\hat\theta$, the inequality is strict because of the decomposition of the variance into the expected conditional variance plus the variance of the conditional expectation, $$\text{var}(\hat\theta)=\mathbb{E}_T[\text{var}(\hat\theta|T)]+ \text{var}_T(\mathbb{E}[\hat\theta|T])=\mathbb{E}_T[\text{var}(\hat\theta|T)]+\text{var}_T(\theta^*),$$ unless the expected conditional variance is zero, which amounts to $\hat\theta$ being a function of $T$ only.
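
To spell out how this decomposition yields the inequality (a standard tower-property step, added here for completeness): since $\mathbb{E}[\hat\theta\mid T]=\theta^*$ and $\theta$ is a constant, $$\mathbb{E}\big[(\hat\theta-\theta)^2\big]=\mathbb{E}_T\Big[\mathbb{E}\big[(\hat\theta-\theta)^2\mid T\big]\Big]=\mathbb{E}_T\big[\text{var}(\hat\theta\mid T)\big]+\mathbb{E}\big[(\theta^*-\theta)^2\big],$$ so the risk of $\hat\theta$ exceeds the risk of $\theta^*$ by exactly $\mathbb{E}_T[\text{var}(\hat\theta\mid T)]\geq 0$, and this excess is zero precisely when $\hat\theta$ is (almost surely) a function of $T$.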
Xi'an
  • ad 2: Why is it impossible that $E(\hat{\theta}^2|T) < \infty$ even though $E(\hat{\theta}^2) = \infty$? Take $\hat{\theta} = X + C$ with $C$ Cauchy and independent of the sufficient statistic $X$; conditioning on $X$ should remove the Cauchy piece. – Horst Grünbusch Feb 26 '16 at 14:17
  • @HorstGrünbusch Why would the Cauchy piece go away when you condition on $T$? Also $\hat{\theta}$ is not an unbiased estimator. – dsaxton Feb 26 '16 at 15:51
  • @HorstGrünbusch It seems to me that your $\hat{\theta}$ does not even have a conditional expectation $\mid T$ (since $C$ does not have an expectation), thus $\theta^\ast$ would be undefined. – Juho Kokkala Feb 26 '16 at 15:53
  • OK, all I wanted was $C$ without variance, not without expectation. ;) Now take $C \sim t_2$, i.e. Student-$t$-distributed with 2 degrees of freedom, so $E(C)=0$, and $C$ independent of $X$. The sufficient statistic is clearly $X$. Then $E(X+C|X) = E(X|X) + E(C|X) = X + E(C) = X$, but $\infty=Var(C) + Var(X) = Var(X+C) > Var(E(X+C|X)) = Var(X) = \sigma^2$. (A short simulation sketch of this example follows after these comments.) – Horst Grünbusch Feb 26 '16 at 16:23
  • So I think it's wrong that a Rao-Blackwell estimator necessarily has infinite variance if the original estimator has infinite variance. (Yet even if both variances were necessarily infinite, $\infty \leq \infty$ would still hold.) – Horst Grünbusch Feb 26 '16 at 16:36
  • Side question: am I correct that if $T$ isn't a sufficient statistic, then $\theta^*$ is no longer an estimate, but **the inequality still holds**? – mercury0114 Jun 18 '20 at 19:12
  • Yes this is correct: $\theta^*$ depends on the unknown $\theta$ and is no longer an estimate but the inequality remains. – Xi'an Jun 18 '20 at 19:58
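
A minimal numerical sketch of the $t_2$ example in the comment above (the values of $\mu$ and $\sigma$, the seed, and the number of replications are my own illustrative choices, not from the comment):

```python
import numpy as np

rng = np.random.default_rng(1)

mu, sigma = 3.0, 1.0   # true mean and known standard deviation (illustrative)
n_rep = 100_000        # Monte Carlo replications

# X ~ N(mu, sigma^2) is the sufficient statistic; C ~ t_2 is independent
# noise with E(C) = 0 but Var(C) = infinity.
x = rng.normal(mu, sigma, size=n_rep)
c = rng.standard_t(df=2, size=n_rep)

theta_hat  = x + c   # original estimator: infinite variance
theta_star = x       # Rao-Blackwellized estimator E(X + C | X) = X

print("empirical MSE of X + C:", np.mean((theta_hat - mu) ** 2))   # unstable across seeds
print("empirical MSE of X    :", np.mean((theta_star - mu) ** 2))  # close to sigma^2 = 1
```

The first number jumps around wildly from seed to seed because $Var(C)=\infty$, while the second settles near $\sigma^2$, which is what the comment's calculation predicts.
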
7
  1. Note that a sufficient statistic is not unique. Trivially, the whole data are sufficient, but conditioning an estimator on them doesn't change anything. So a sufficient statistic alone is not sufficient (pun!) for having minimal mean squared error. See the Lehmann-Scheffé theorem, whose proof uses the Rao-Blackwell theorem, for a condition that does suffice (in fact, being sufficient and complete).

  2. If both variances are infinite, the weak inequality is trivially true ($\infty \leq \infty$). But then, as a counterexample, you can construct an estimator that is not a function of $T$ but whose Rao-Blackwellization still has infinite variance (so that only $\leq$ holds).

Take for example $C_1 \sim t_2 + \mu$, a shifted $t_2$-distributed random variable with $E(C_1) = \mu$ and $Var(C_1) = \infty$, and as another independent random variable $C_2 \sim t_2$. The parameter to estimate is $\mu$. The original estimator is $\hat{\theta} = C_1 + C_2$. A sufficient statistic is of course $C_1$. Both the Rao-Blackwell estimator $E(\hat{\theta}|C_1)=C_1$ and $\hat{\theta}$ have infinite variance, so the inequality holds only weakly. On the other hand, $C_1+C_2$ is not a mere function of $C_1$: it involves the other random variable, so without the finite-variance assumption this would contradict the last sentence of the theorem, the one your 3rd question asks about. In fact, some textbooks admit infinite variance for the original estimator, but in turn they cannot state when $<$ holds. (A small simulation sketch of this example is given after the list.)

  3. If $\hat{\theta}$ is a function of $T$, then $E(\hat{\theta}\mid T) = \hat{\theta}$, so again we end up improving nothing. Apart from this case, the inequality is strict, and that's the non-trivial assertion of the theorem.
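
A minimal simulation sketch of the example above (the parameter value, seed, and number of replications are my own illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(0)

mu = 3.0          # parameter to estimate (illustrative value)
n_rep = 100_000   # Monte Carlo replications

# C1 ~ mu + t_2 is sufficient for mu here; C2 ~ t_2 is independent noise.
# Both t_2 variables have infinite variance -- that is the point.
c1 = mu + rng.standard_t(df=2, size=n_rep)
c2 = rng.standard_t(df=2, size=n_rep)

theta_hat  = c1 + c2   # original estimator
theta_star = c1        # Rao-Blackwellized estimator E(theta_hat | C1) = C1

# Empirical mean squared errors: neither stabilizes as n_rep grows,
# because both theoretical variances are infinite.
print("empirical MSE of C1 + C2:", np.mean((theta_hat - mu) ** 2))
print("empirical MSE of C1     :", np.mean((theta_star - mu) ** 2))
```

Neither empirical MSE stabilizes as the number of replications grows, precisely because both theoretical variances are infinite; the run only illustrates that conditioning on $C_1$ strips away the extra noise coming from $C_2$, not that either risk is finite.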
Horst Grünbusch