
I'm currently learning the method of Chib (1995) for calculating the marginal likelihood of a Bayesian model from Gibbs sampling output. I'm stuck on the Rao-Blackwellization step.

Suppose $\mu$ and $\phi$ are the model parameters and $y$ is the data. The key step is to evaluate $p(\mu,\phi|y)$, which can be written as $p(\mu|\phi,y)\,p(\phi|y)$. The first factor is the full conditional and is known. The second factor can be further written as $p(\phi|y)=\int p(\phi|\mu,y)\,p(\mu|y)\,d\mu$. Based on the Rao-Blackwell theorem,

$$p(\phi^* |y) \approx \dfrac{1}{N} \sum_{i=1}^{N} p(\phi^* | \mu^i, y)$$

As far as I understand, this is an average of the full conditional $p(\phi^* | \mu, y)$, over the posterior marginal distribution of $\mu$. So, we need to supply a set of realizations of $\mu$ from the posterior marginal distribution of $\mu$, i.e., $p(\mu|y)$.
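To make the estimator concrete, here is a minimal sketch in Python. It assumes, purely for illustration, a conjugate Normal model $y_j \sim N(\mu, 1/\phi)$ with prior $\phi \sim \text{Gamma}(a,b)$, so that the full conditional $p(\phi\,|\,\mu,y)$ is a Gamma density; the function names and hyperparameter values are hypothetical, not from Chib's paper.

```python
import math

def gamma_pdf(x, shape, rate):
    """Density of a Gamma(shape, rate) distribution at x > 0."""
    return math.exp(shape * math.log(rate) + (shape - 1) * math.log(x)
                    - rate * x - math.lgamma(shape))

def rao_blackwell_phi_density(phi_star, mu_draws, y, a=2.0, b=1.0):
    """Rao-Blackwell estimate (1/N) * sum_i p(phi* | mu_i, y).

    Illustrative model: y_j ~ N(mu, 1/phi) with phi ~ Gamma(a, b),
    so the full conditional of phi given mu is
    Gamma(a + n/2, b + 0.5 * sum_j (y_j - mu)^2).
    """
    n = len(y)
    total = 0.0
    for mu in mu_draws:
        ss = sum((yj - mu) ** 2 for yj in y)
        total += gamma_pdf(phi_star, a + n / 2, b + 0.5 * ss)
    return total / len(mu_draws)
```

Whatever the `mu_draws` supplied, the result is a mixture of Gamma densities in $\phi^*$, so it integrates to one; the question is only whether the draws represent $p(\mu|y)$.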

However, Chib suggests that we can plug the Gibbs sampling outputs for $\mu$ into the summation. But aren't the Gibbs outputs draws from the joint posterior $p(\mu,\phi|y)$? Why can we suddenly use draws from the joint distribution in place of draws from the marginal distribution?

Some discussions of Gibbs sampling and Rao-Blackwellization are here, but my question is not covered by the answers. Is this a misunderstanding about Gibbs sampling on my part?

Ding Li
    Your question about marginal and joint distributions has been answered here: https://stats.stackexchange.com/questions/213052/sampling-from-marginal-distribution-using-conditional-distribution/213057?noredirect=1#comment641484_213057 – Greenparker Apr 15 '18 at 09:03

1 Answer


Note: This issue has already been discussed on X Validated, but I cannot trace the items. Like this one.

This is a standard misunderstanding about multivariate simulation, a Monte Carlo version of the Law of the Unconscious Statistician: when producing a sample $\{(\mu_i,\phi_i),\,i=1,\ldots,I\}$ from a joint posterior distribution $p(\mu,\phi|y)$, the subsamples $\{\mu_i,\,i=1,\ldots,I\}$ and $\{\phi_i,\,i=1,\ldots,I\}$ are automatically distributed from the marginal posterior distributions $p(\mu|y)$ and $p(\phi|y)$, respectively. Hence, assuming the Gibbs sampler has reached its stationary regime, plugging the output of the sampler into $$\dfrac{1}{N} \sum_{i=1}^{N} p(\phi^* | \mu^i, y)$$ is valid, since $$ \underbrace{\mathbb{E}^{\mu^i}[p(\phi^* | \mu^i, y)]}_{\text{expectation in }\mu^i}=\int p(\phi^* | \mu, y)\, p(\mu,\phi|y)\,\text{d}\mu\,\text{d}\phi = \int p(\phi^* | \mu, y)\, p(\mu|y)\,\text{d}\mu = p(\phi^* | y). $$
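As a numerical sanity check, the following sketch runs a Gibbs sampler in a fully conjugate Normal-Gamma model (a hypothetical example chosen because the marginal $p(\phi|y)$ is then a Gamma density in closed form) and compares the Rao-Blackwell estimate, computed from the joint Gibbs draws, against that exact marginal; all parameter names and default values are illustrative.

```python
import math
import random

def gamma_pdf(x, shape, rate):
    """Density of a Gamma(shape, rate) distribution at x > 0."""
    return math.exp(shape * math.log(rate) + (shape - 1) * math.log(x)
                    - rate * x - math.lgamma(shape))

def gibbs_rao_blackwell(y, phi_star, n_iter=5000, burn=500,
                        a=2.0, b=1.0, m0=0.0, k0=1.0, seed=1):
    """Gibbs sampler for y_i ~ N(mu, 1/phi) under the conjugate prior
    phi ~ Gamma(a, b), mu | phi ~ N(m0, 1/(k0*phi)).

    Returns (rao_blackwell_estimate, exact) for the marginal density
    p(phi* | y), which here is a Gamma(a + n/2, b_n) density.
    """
    rng = random.Random(seed)
    n = len(y)
    ybar = sum(y) / n
    kn = k0 + n
    mn = (k0 * m0 + n * ybar) / kn
    phi, terms = 1.0, []
    for t in range(n_iter):
        # mu | phi, y  ~  N(mn, 1/(kn*phi))
        mu = rng.gauss(mn, math.sqrt(1.0 / (kn * phi)))
        # phi | mu, y  ~  Gamma(a + (n+1)/2, rate(mu))
        rate = (b + 0.5 * sum((yi - mu) ** 2 for yi in y)
                + 0.5 * k0 * (mu - m0) ** 2)
        phi = rng.gammavariate(a + (n + 1) / 2, 1.0 / rate)
        if t >= burn:
            # Rao-Blackwell term: the full conditional at phi*,
            # evaluated at the CURRENT mu draw from the joint chain
            terms.append(gamma_pdf(phi_star, a + (n + 1) / 2, rate))
    # closed-form marginal posterior of phi in this conjugate model
    ss = sum((yi - ybar) ** 2 for yi in y)
    b_n = b + 0.5 * ss + 0.5 * k0 * n * (ybar - m0) ** 2 / kn
    exact = gamma_pdf(phi_star, a + n / 2, b_n)
    return sum(terms) / len(terms), exact
```

The two returned numbers agree up to Monte Carlo error precisely because the $\mu$-coordinates of the joint Gibbs draws, taken on their own, are (dependent) draws from $p(\mu|y)$.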

Xi'an