I'm following an example from Murphy's book (Sec. 21.3.2) on deriving the mean field update equations for a variational approximation to the posterior of the Ising model.
Problem:
let $x_i\in\{-1, 1\}$. We have a joint model of the form $$ p(\mathbf{x}, \mathbf{y})=p(\mathbf{x}) p(\mathbf{y} \mid \mathbf{x}) $$ with the prior of the form $$ p(\mathbf{x}) =\frac{1}{Z_{0}} \exp \left(\sum_{i=1}^{D} \sum_{j \in \mathrm{nbr}_{i}} W_{i j} x_{i} x_{j}\right) $$ and the likelihood of the form $$ p(\mathbf{y} \mid \mathbf{x})=\prod_{i} p\left(\mathbf{y}_{i} \mid x_{i}\right)=\exp \left(\sum_{i}-L_{i}\left(x_{i}\right)\right) $$ Suppose that we approximate the posterior distribution by a fully factored approximation $$ q(\mathbf{x})=\prod_{i} q\left(x_{i}, \mu_{i}\right) $$ where $\mu_i$ is the mean value of node $i$. Derive the update relations for $\mu_i$.
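To make the setup concrete for myself, here is how I would evaluate the unnormalized log joint numerically. This is only a rough sketch: the $2\times 2$ grid, the coupling strength, and the Gaussian-style $L_i$ are placeholders I made up, with $L_i(x_i) = -\log p(y_i \mid x_i)$ as implied by the likelihood above.

```python
import numpy as np

# Made-up 2x2 grid just to make the notation concrete: W[i, j] is nonzero only
# for neighbouring nodes, x[i] is in {-1, +1}, and L_i is a Gaussian-style term.
D = 4
W = np.zeros((D, D))
for i, j in [(0, 1), (0, 2), (1, 3), (2, 3)]:   # edges of a 2x2 lattice (assumed)
    W[i, j] = W[j, i] = 1.0                     # coupling strength (assumed)

y = np.array([0.5, -0.3, 1.2, -0.9])            # fake noisy observations
sigma2 = 1.0

def L(i, x_i):
    """Local term L_i(x_i) = -log p(y_i | x_i); Gaussian form is my assumption."""
    return 0.5 * (y[i] - x_i) ** 2 / sigma2

def log_p_tilde(x):
    """Unnormalised log joint: sum_i sum_{j in nbr_i} W_ij x_i x_j - sum_i L_i(x_i)."""
    prior = sum(W[i, j] * x[i] * x[j]
                for i in range(D) for j in range(D) if W[i, j] != 0)
    lik = sum(L(i, x[i]) for i in range(D))
    return prior - lik

print(log_p_tilde(np.array([1, -1, 1, -1])))
```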
Book's description:
The goal of variational inference is to find a variational approximation $q(\mathbf{x})$ to the posterior distribution, $$ q(\mathbf{x})=\prod_{i=1}^{D} q_{i}\left(\mathbf{x}_{i}\right) $$ by solving the optimization problem $\min _{q_{1}, \ldots, q_{D}} \mathbb{K} \mathbb{L}(q \| p)$. It can be shown that at each step we do the following update $$ \log q_{j}\left(\mathbf{x}_{j}\right)=\mathbb{E}_{-q_{j}}[\log \tilde{p}(\mathbf{x})]+\text { const } $$ where $\tilde{p}(\mathbf{x})=p(\mathbf{x}, \mathcal{D})$ is the unnormalized posterior and the notation $\mathbb{E}_{-q_{j}}[f(\mathbf{x})]$ means to take the expectation over $f(\mathbf{x})$ with respect to all the variables except for $x_{j}$. For example, if we have three variables, then $$ \mathbb{E}_{-q_{2}}[f(\mathbf{x})]=\sum_{x_{1}} \sum_{x_{3}} q_1\left(x_{1}\right) q_{3}\left(x_{3}\right) f\left(x_{1}, x_{2}, x_{3}\right) $$
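To check that I understand the notation, here is a brute-force version of the three-variable example above (the marginals $q_1$, $q_3$ and the function $f$ are arbitrary choices of mine, not from the book):

```python
from itertools import product

# Brute-force illustration of E_{-q_2}[f(x)]: average f over x_1 and x_3 under
# q_1 and q_3, leaving x_2 free, so the result is a function of x_2 only.
q1 = {-1: 0.3, +1: 0.7}     # made-up marginal for x_1
q3 = {-1: 0.6, +1: 0.4}     # made-up marginal for x_3

def f(x1, x2, x3):
    return x1 * x2 + x2 * x3   # an arbitrary function of all three variables

def E_minus_q2(x2):
    """E_{-q_2}[f] = sum_{x1} sum_{x3} q1(x1) q3(x3) f(x1, x2, x3)."""
    return sum(q1[x1] * q3[x3] * f(x1, x2, x3)
               for x1, x3 in product([-1, 1], repeat=2))

for x2 in (-1, +1):
    print(x2, E_minus_q2(x2))
```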
My attempt:
First I write down the posterior distribution \begin{align} p(\mathbf{x}|\mathbf{y})&=\frac{1}{Z}\,p(\mathbf{x})\,p(\mathbf{y}|\mathbf{x})\\ &=\frac{1}{Z}\exp \left(\sum_{i=1}^{D} \sum_{j \in \mathrm{nbr}_{i}} W_{i j} x_{i} x_{j}\right)\exp \left(\sum_{i}-L_{i}\left(x_{i}\right)\right)\\ &=\frac{1}{Z}\exp \left(\sum_{i=1}^{D} \sum_{j \in \mathrm{nbr}_{i}} W_{i j} x_{i} x_{j}-\sum_{i}L_{i}\left(x_{i}\right)\right) \end{align} where $Z$ is the normalization constant (I have absorbed $Z_0$ into $Z$). Then I find $\log\tilde{p}(\mathbf{x})$: \begin{align} \log\tilde{p}(\mathbf{x}) &= \log(Z\, p(\mathbf{x}|\mathbf{y}))\\ &=\log\left(\exp \left(\sum_{i=1}^{D} \sum_{j \in \mathrm{nbr}_{i}} W_{i j} x_{i} x_{j}-\sum_{i}L_{i}\left(x_{i}\right)\right)\right)\\ &=\sum_{i=1}^{D} \left(x_{i}\sum_{j \in \mathrm{nbr}_{i}} W_{i j} x_{j}-L_{i}\left(x_{i}\right)\right) \end{align} Now I take the expectation $\mathbb{E}_{-q_i}[\log \tilde{p}(\mathbf{x})]$, using $k$ and $l$ as running indices over the nodes so they don't clash with the fixed node $i$: \begin{align} \mathbb{E}_{-q_i}[\log\tilde{p}(\mathbf{x})] &= \sum_{\mathbf{x}_{-i}}\prod_{k\neq i}q_k(x_k)\,\log\tilde{p}(\mathbf{x})\\ &=\sum_{\mathbf{x}_{-i}}\prod_{k\neq i}q_k(x_k)\sum_{l=1}^{D} \left(x_{l}\sum_{j \in \mathrm{nbr}_{l}} W_{l j} x_{j}-L_{l}\left(x_{l}\right)\right) \end{align} where $\mathbf{x}_{-i}$ denotes all the variables except $x_i$. Therefore \begin{align} q_i(x_i) &= \exp\left( \mathbb{E}_{-q_i}[\log\tilde{p}(\mathbf{x})] + \text{const}\right)\\ &=\exp\left(\sum_{\mathbf{x}_{-i}}\prod_{k\neq i}q_k(x_k)\sum_{l=1}^{D} \left(x_{l}\sum_{j \in \mathrm{nbr}_{l}} W_{l j} x_{j}-L_{l}\left(x_{l}\right)\right)+ \text{const}\right) \end{align}
Notes
The book goes on to get an update equation for the parameter $\mu_i$, but I'm a bit confused about how they proceed from here. I would appreciate any comments on how to get there.
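For reference, here is my reading of the final result the book arrives at, written as a small sketch so the target is concrete. The grid and $L_i$ are the same placeholders as above, the update uses the sign convention of this post ($L_i(x_i) = -\log p(y_i \mid x_i)$), and I'm glossing over whether the neighbour sum counts each edge once or twice (that would only rescale $W$):

```python
import numpy as np

# My reading of where Sec. 21.3.2 ends up, as code. Everything here is a sketch
# under the assumptions stated above (placeholder grid, couplings and L_i).
D = 4
W = np.zeros((D, D))
for i, j in [(0, 1), (0, 2), (1, 3), (2, 3)]:    # 2x2 lattice (assumed)
    W[i, j] = W[j, i] = 1.0

y = np.array([0.5, -0.3, 1.2, -0.9])
L = lambda i, x_i: 0.5 * (y[i] - x_i) ** 2       # placeholder L_i(x_i)

mu = np.zeros(D)                                  # initial mean values
for _ in range(50):                               # coordinate-ascent sweeps
    for i in range(D):
        m_i = W[i] @ mu                           # mean field from the neighbours
        # q_i(x_i) ∝ exp(x_i * m_i - L_i(x_i)), so for x_i in {-1, +1} the mean is
        # mu_i = tanh(m_i + 0.5 * (L_i(-1) - L_i(+1))) -- the update I believe the
        # book derives, modulo its sign convention on L_i.
        mu[i] = np.tanh(m_i + 0.5 * (L(i, -1) - L(i, +1)))

print(mu)
```

It's the step from the expectation above to this closed form for $\mu_i$ that I can't reproduce.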