Understand the Holmes and Held (2006) Bayesian probit MCMC algorithm

Question

Holmes and Held (2006) suggest a simple approach to reduce autocorrelation in the MCMC algorithm proposed by Albert and Chib (1993). HH (2006) propose to update $\beta$ and $z$ jointly, making use of the following factorisation,

$$ \pi(\beta, z|y) = \pi(z|y) \pi(\beta|z) $$

As far as I understand, $\pi(z|y)$ is given by

\begin{array}{ccc} \pi(z|y) & \propto & \pi(y|z)\pi(z)\\ & = & \pi(y|z)\int_{\beta}\pi(z|\beta)\pi\left(\beta\right)d\beta \\ & = & \text{N}(\boldsymbol{0},I_{n}+xV_{\beta}x^{T})\times \text{Ind}(y,z) \end{array}

Since direct sampling from the multivariate truncated normal is difficult, HH (2006) claim that it is straightforward to Gibbs sample from the univariate conditional distributions

$$ z_{i}|\boldsymbol{z}_{-i},y_{i}\propto\begin{cases} N(m_{i},v_{i})\times I(z_{i}>0) & y_{i}=1\\ N(m_{i},v_{i})\times I(z_{i}\leq0) & \text{otherwise} \end{cases} $$
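Each of these univariate draws can be made by inverting the normal CDF. Below is a minimal sketch using only the Python standard library. The function name `draw_truncated_normal` and its arguments are illustrative, not taken from HH (2006), and inversion becomes inaccurate when zero lies far in the tail of $N(m_{i},v_{i})$.

```python
import random
from statistics import NormalDist


def draw_truncated_normal(m, v, y, rng=random):
    """Draw z_i from N(m, v) restricted to z_i > 0 if y == 1, else z_i <= 0."""
    dist = NormalDist(mu=m, sigma=v ** 0.5)
    p0 = dist.cdf(0.0)                      # probability mass on (-inf, 0]
    u = rng.random()                        # uniform on [0, 1)
    # Map u onto the part of the CDF range that matches the sign constraint.
    p = p0 + u * (1.0 - p0) if y == 1 else u * p0
    # inv_cdf requires 0 < p < 1, so keep p strictly inside the unit interval.
    eps = 1e-12
    p = min(max(p, eps), 1.0 - eps)
    return dist.inv_cdf(p)


rng = random.Random(0)
print(draw_truncated_normal(0.3, 1.05, 1, rng))   # draw for y_i = 1
print(draw_truncated_normal(0.3, 1.05, 0, rng))   # draw for y_i = 0
```

In a full sampler, $m_{i}$ and $v_{i}$ would be recomputed from the current $z$ before each draw.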

Now comes the mysterious part which I do not understand.

HH (2006) state that $m_{i}$ and $v_{i}$ are the mean and variance of each $z_{i}$, which can be “obtained from the leave-one-out marginal predictive densities.” They continue: “Using, for example, Henderson & Searle (1981) we can calculate the parameters efficiently as”

\begin{array}{ccc} m_{i} & = & x_{i}B-w_{i}(z_{i}-x_{i}B)\\ v_{i} & = & 1+w_{i}\\ w_{i} & = & h_{i}/(1-h_{i}) \end{array}

where $z_{i}$ is the current value for $z_{i}$,

$B=\left(V_{\beta}^{-1}+x^{T}x\right)^{-1}x^{T}z$,

and $h_{i}$ is the diagonal element of $x\left(V_{\beta}^{-1}+x^{T}x\right)^{-1}x^{T}$.

Now, my question is: where do the formulas for $m_i$, $v_i$ and $w_i$ come from? I have been stuck on this problem for 3 weeks. I think they may be related to the conditional distributions of a multivariate normal distribution, but the expressions still look quite different from the usual ones.
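One way to test that hunch numerically is to compare the quoted formulas with the textbook conditional moments of $z_{i}$ given $z_{-i}$ under $z\sim N(\boldsymbol{0},I_{n}+xV_{\beta}x^{T})$. The sketch below does this for a toy problem with three observations and one covariate, so that every inverse is available in closed form. All numbers and function names are hypothetical.

```python
# Toy problem (all numbers hypothetical): n = 3 observations, p = 1 covariate,
# prior beta ~ N(0, V) with scalar V, so every matrix inverse is closed form.
x = [0.5, -1.0, 2.0]   # single-column design matrix
z = [0.3, -0.7, 1.2]   # current latent values
V = 4.0                # prior variance V_beta
n = len(x)


def hh_moments(i):
    """m_i and v_i from the shortcut formulas quoted in the question."""
    s = 1.0 / V + sum(xj * xj for xj in x)         # V^{-1} + x^T x
    B = sum(xj * zj for xj, zj in zip(x, z)) / s   # (V^{-1} + x^T x)^{-1} x^T z
    h = x[i] * x[i] / s                            # i-th diagonal of x (...)^{-1} x^T
    w = h / (1.0 - h)
    return x[i] * B - w * (z[i] - x[i] * B), 1.0 + w


def conditional_moments(i):
    """Mean and variance of z_i | z_{-i} for z ~ N(0, I + x V x^T)."""
    sigma = [[(1.0 if r == c else 0.0) + V * x[r] * x[c] for c in range(n)]
             for r in range(n)]
    a, b = [j for j in range(n) if j != i]
    det = sigma[a][a] * sigma[b][b] - sigma[a][b] * sigma[b][a]
    # k = Sigma_{i,-i} Sigma_{-i,-i}^{-1}, using the closed-form 2x2 inverse
    k_a = (sigma[i][a] * sigma[b][b] - sigma[i][b] * sigma[b][a]) / det
    k_b = (sigma[i][b] * sigma[a][a] - sigma[i][a] * sigma[a][b]) / det
    mean = k_a * z[a] + k_b * z[b]
    var = sigma[i][i] - (k_a * sigma[a][i] + k_b * sigma[b][i])
    return mean, var


for i in range(n):
    print(i, hh_moments(i), conditional_moments(i))
```

In this toy case the two pairs agree to floating-point precision for every $i$, which is consistent with $m_{i}$ and $v_{i}$ being the conditional mean and variance of $z_{i}$ given $z_{-i}$.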

Could you help me? Thank you very much.

References:

Holmes, C. C., & Held, L. (2006). Bayesian auxiliary variable models for binary and multinomial regression. Bayesian Analysis, 1(1), 145-168. Link: https://projecteuclid.org/download/pdf_1/euclid.ba/1340371078
Albert, J., & Chib, S. (1993). Bayesian analysis of binary and polychotomous response data. Journal of the American Statistical Association, 88, 669-679.
Henderson, H., & Searle, S. (1981). On deriving the inverse of a sum of matrices. SIAM Review, 23, 53-60.

Using a plain Gibbs sampling would lead to $m_i$ and $v_i$ being the conditional mean and conditional variance, rather. — Xi'an, Aug 29 '19 at 05:49
@Xi'an Dear professor, thank you very much for your comment. I am quite sorry that I still could not quite understand why it is the case. Do you mind giving me more details or some examples/references on the relevant mathematical derivations? — Sheng Bi, Aug 29 '19 at 14:33
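
A sketch of the derivation being asked for, assuming the zero-mean prior $\beta\sim N(\boldsymbol{0},V_{\beta})$ implied by the marginal in the question. The symbols $\Sigma$, $Q$ and $H$ are shorthand introduced here, not notation from HH (2006).

$$
\begin{aligned}
\Sigma &= I_{n}+xV_{\beta}x^{T}, \qquad H = x\left(V_{\beta}^{-1}+x^{T}x\right)^{-1}x^{T},\\
Q &= \Sigma^{-1} = I_{n}-H \quad \text{(Woodbury identity, as in Henderson and Searle, 1981)},\\
v_{i} &= \operatorname{Var}(z_{i}|\boldsymbol{z}_{-i}) = \frac{1}{Q_{ii}} = \frac{1}{1-h_{i}} = 1+w_{i},\\
m_{i} &= E(z_{i}|\boldsymbol{z}_{-i}) = -\frac{1}{Q_{ii}}\sum_{j\neq i}Q_{ij}z_{j} = \frac{1}{1-h_{i}}\sum_{j\neq i}H_{ij}z_{j}\\
&= \frac{x_{i}B-h_{i}z_{i}}{1-h_{i}} = x_{i}B-w_{i}(z_{i}-x_{i}B).
\end{aligned}
$$

The last line uses $Hz=xB$, so that $\sum_{j}H_{ij}z_{j}=x_{i}B$, together with $1/(1-h_{i})=1+w_{i}$. Because the $j=i$ term is removed from the sum, $m_{i}$ does not depend on the current $z_{i}$, even though $z_{i}$ enters the formula through $B$.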
