In an astronomical context, the authors of a paper want to use a Gibbs sampling algorithm. Please note: I am inexperienced with MCMC algorithms, and with Gibbs sampling specifically.
What we want, in essence, is the full posterior distribution given some data: $P(X,Y|data)$. To achieve this, we alternately sample from the conditional distributions $P(X|Y,data)$ and $P(Y|X,data)$. Once the chain has reached its stationary distribution, the samples are, as I understand it, statistically equivalent to samples from the full joint distribution.
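To make my understanding of the procedure concrete, here is a minimal sketch of a two-variable Gibbs sampler. The target is a toy of my own choosing, not the posterior from the paper: I assume $P(X,Y|data)$ is a standard bivariate Gaussian with correlation `rho`, so both conditionals are Gaussians that can be drawn from directly. The function names and parameters are hypothetical.

```python
import math
import random


def gibbs_bivariate_normal(rho, n_steps, x0=0.0, y0=0.0, seed=0):
    """Gibbs sampler for a standard bivariate normal with correlation rho.

    Toy stand-in for P(X, Y | data); this is an assumed target,
    not the CMB lensing posterior discussed in the paper.
    """
    rng = random.Random(seed)
    cond_sd = math.sqrt(1.0 - rho * rho)  # std. dev. of each conditional
    x, y = x0, y0
    samples = []
    for _ in range(n_steps):
        x = rng.gauss(rho * y, cond_sd)  # draw X from P(X | Y)
        y = rng.gauss(rho * x, cond_sd)  # draw Y from P(Y | X)
        samples.append((x, y))
    return samples


def summarize(samples):
    """Return (mean of X, variance of X, sample correlation of X and Y)."""
    n = len(samples)
    mean_x = sum(x for x, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    var_x = sum((x - mean_x) ** 2 for x, _ in samples) / n
    var_y = sum((y - mean_y) ** 2 for _, y in samples) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in samples) / n
    return mean_x, var_x, cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    draws = gibbs_bivariate_normal(rho=0.5, n_steps=20000, seed=1)
    print(summarize(draws))
```

If my understanding is right, the summary for a long run should approach mean 0, variance 1 and correlation `rho`, i.e. the properties of the joint distribution, even though the code only ever draws from the two conditionals.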
Now, the authors note that in this particular case,
" (...) unfortunately the two marginal distributions $P(X|Y,data)$ and $P(Y|X,data)$ are in general both much narrower than the marginalized distribution $P(X|data)$: given a particular $Y$ [in this case, lensing potential], $X$ [in this case, the delensed CMB] is given essentially by a delta function. This means that naive Gibbs iterations will not converge within a reasonable time."
Now these statements are completely lost on me. So my question is twofold:
- The statement about the delta function is presumably meant as a supplementary explanation, but it doesn't clarify anything for me. Why is this 'essentially a delta function'?
- Second, and more importantly: OK, suppose the conditionals are 'much narrower'. So what? Why is $P(X|data)$ even relevant? In Gibbs sampling, aren't we only interested in $P(X|Y,data)$, $P(Y|X,data)$ and ultimately $P(X,Y|data)$? Why would narrow conditionals mean slow convergence?
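To make the second question concrete, here is a toy case of my own (an assumption for illustration, not something taken from the paper). Suppose that, given the data, $(X,Y)$ is a standard bivariate Gaussian with correlation $\rho$. Then

$$
X \mid Y, \mathrm{data} \;\sim\; \mathcal{N}\!\left(\rho\,Y,\; 1-\rho^{2}\right),
\qquad
X \mid \mathrm{data} \;\sim\; \mathcal{N}(0,\, 1),
$$

so for $|\rho|$ close to 1 the conditional variance $1-\rho^{2}$ is much smaller than the marginal variance of 1. I assume this is the kind of situation the authors have in mind when they call the conditionals 'much narrower', but I may be misreading them.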
The paper in question is a review by Challinor and Lewis (2006). The arXiv preprint can be found here:
http://arxiv.org/pdf/astro-ph/0601594v4.pdf
The text I'm referring to is at the end of section 8, 'delensing the sky'.