
I am wondering about the different ways that Bayesian and frequentist statistics connect with each other.

I recalled that the Maximum Likelihood estimate of a parameter $\theta$ is not necessarily an unbiased estimator of that parameter.

That made me wonder: Is the Mean Posterior estimate of $\theta$ an unbiased estimator?

That is,

Does $\phi(x)=E(\theta\mid x)$, imply $E(\phi(x)\mid\theta)=\theta$?

Note that this is indeed a meaningful question, since $\phi(x)$, while it is a Bayesian estimator, is simply a function from the data to the real line and so can also be seen as a classical frequentist estimator.

If this question cannot be answered in general, please assume the prior is uniform.

If not, is there some other Bayesian estimator (i.e. a function from the posterior to $\mathbb R$) that is always an unbiased estimator in the frequentist sense?


1 Answer


This is a meaningful question whose answer is well known: when using a proper prior $\pi$ on $\theta$, the posterior mean $\delta^\pi(x) = \mathbb{E}^\pi[\theta\mid x]$ cannot be unbiased. Otherwise the integrated Bayes risk would be zero: assuming unbiasedness, i.e. $\mathbb{E}^X[\delta^\pi(X)\mid\theta]=\theta$ for all $\theta$, \begin{align*} r(\pi; \delta^\pi) &= \overbrace{\mathbb{E}^\pi\{\underbrace{\mathbb{E}^X[(\delta^\pi(X)-\theta)^2\mid\theta]}_{\text{exp. under likelihood}}\}}^{\text{expectation under prior}}\\ &= \mathbb{E}^\pi\{\mathbb{E}^X[\delta^\pi(X)^2+\theta^2-2\,\delta^\pi(X)\theta\mid\theta]\}\\ &= \mathbb{E}^\pi[\theta^2]+\underbrace{\mathbb{E}^X[\delta^\pi(X)^2]}_{\text{exp. under marginal}}-\mathbb{E}^\pi\{\theta\,\mathbb{E}^X[\delta^\pi(X)\mid\theta]\}-\overbrace{\mathbb{E}^X\{\mathbb{E}^\pi[\theta\mid X]\,\delta^\pi(X)\}}^{\text{exp. under marginal}}\\ &= \mathbb{E}^\pi[\theta^2]+\mathbb{E}^X[\delta^\pi(X)^2]-\mathbb{E}^\pi[\theta^2]-\mathbb{E}^X[\delta^\pi(X)^2]\\ &= 0 \end{align*} Here the cross term $-2\,\mathbb{E}[\delta^\pi(X)\theta]$ is split in two: one half is evaluated conditionally on $\theta$, where unbiasedness gives $\mathbb{E}^X[\delta^\pi(X)\mid\theta]=\theta$; the other half is evaluated conditionally on $X$, where the posterior-mean property gives $\mathbb{E}^\pi[\theta\mid X]=\delta^\pi(X)$. A Bayes risk of zero forces $\delta^\pi(X)=\theta$ almost surely, which is impossible outside degenerate problems. [Notation: $\mathbb{E}^X$ means that $X$ is the random variable integrated in the expectation, either under the likelihood (conditional on $\theta$) or under the marginal (integrating out $\theta$), while $\mathbb{E}^\pi$ means that $\theta$ is the random variable integrated. Note that $\mathbb{E}^X[\delta^\pi(X)]$ is an integral with respect to the marginal, while $\mathbb{E}^X[\delta^\pi(X)\mid\theta]$ is an integral with respect to the sampling distribution.]
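As a quick numerical illustration (not part of the original answer), here is a hedged sketch in a Beta-Binomial model, where the posterior mean has a closed form; the chosen values of $n$, $a$, $b$, $\theta$ are arbitrary. The Monte Carlo estimate of $\mathbb{E}^X[\delta^\pi(X)\mid\theta]$ shrinks toward the prior mean and so differs from $\theta$:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: X ~ Binomial(n, theta) with a proper Beta(a, b) prior,
# so the posterior mean is delta(x) = (a + x) / (a + b + n).
n, a, b = 10, 2.0, 2.0
theta = 0.9            # one fixed "true" parameter value

# Monte Carlo estimate of the frequentist expectation E[delta(X) | theta]
x = rng.binomial(n, theta, size=200_000)
delta = (a + x) / (a + b + n)

# Exact value is (a + n*theta)/(a + b + n) = 11/14, well below theta = 0.9:
print(delta.mean())    # shrunk toward the prior mean 0.5, hence biased
```

The bias is exactly the shrinkage toward the prior mean that makes the posterior mean attractive in terms of Bayes risk.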

The argument does not extend to improper priors such as the flat prior (which is not a proper uniform distribution!), since the integrated Bayes risk is then infinite. Hence, some generalised Bayes estimators may turn out to be unbiased, as for instance the MLE in the Normal mean problem, which is also the Bayes posterior expectation under the flat prior. (But there is no general unbiasedness property for improper priors!)

A side property of interest is that $\delta^\pi(x) = \mathbb{E}^\pi[\theta\mid x]$ is sufficient in a Bayesian sense, as $$\mathbb{E}^\pi\{\theta\mid\mathbb{E}^\pi[\theta\mid x]\}=\mathbb{E}^\pi[\theta\mid x],$$ that is, conditioning upon $\mathbb{E}^\pi[\theta\mid x]$ is the same as conditioning on $x$ for estimating $\theta$.

  • Thank you. Just a quick question: what exactly does $E^X(\cdot)$ mean? Does it mean $E(\cdot\mid X)$? – user56834 Nov 12 '17 at 14:24
  • Ok. You say that this doesn't apply to flat priors. (why are flat priors not the same as uniform?). Does that mean that in the case of a flat prior, the result I'm looking for holds? – user56834 Nov 12 '17 at 14:34
  • So this would depend on the specific distribution $p(x|\theta)$? Is there no Bayesian estimator that is unbiased, regardless of the distribution? – user56834 Nov 12 '17 at 14:55
  • That's right, but a Bayesian estimator will take the correct distribution of $X$ into account, since it will be reflected in the posterior, correct? So that's why my thought was that maybe there is a Bayesian estimator that works regardless of the distribution of $X$. – user56834 Nov 12 '17 at 16:21
  • If the prior is improper, there may be distributions on $X$ that lead to an improper posterior. – Xi'an Nov 12 '17 at 16:24
  • Note that this is question 4.7 in BDA3 (Gelman et al.) with solution here: http://www.stat.columbia.edu/~gelman/book/solutions3.pdf. I have a follow-up question, @Xi'an: suppose a proper normal prior with mean equal to the true mean, say mu0. Then the posterior mean is a weighted average of x-bar and mu0, which has expectation mu0 = the true mean. Yet the problem is not degenerate. What have I missed? – Sheridan Grant Aug 20 '19 at 20:57
  • ..I guess the problem is "degenerate" in the sense that the true distribution of theta is degenerate, but not in the sense that the posterior mean is constant. Should we maybe clarify that this only holds when the assumed prior is the true distribution of theta? – Sheridan Grant Aug 20 '19 at 21:32
  • @SheridanGrant: the "true mean" of the observable $\bar{x}$, $\mu_0$, is unknown and hence varies within a range of possible values. Unbiasedness addresses a frequentist property that must hold for _all_ values of $\mu_0$, not just _one_. In that sense it is impossible to centre the prior at $\mu_0$. – Xi'an Jun 17 '20 at 17:10