Hi, I am learning loopy belief propagation for MRFs. The general roadmap is to define the Bethe approximation, which is exact on a tree but only approximate on general (loopy) graphs.
I'm currently stuck on computing the Bethe entropy. Consider the pairwise tree here (p. 21), where $b(\cdot)$ is a belief (marginal distribution) of either a factor or a node, and $d_i$ is the degree of node $i$ (the number of factors that contain it).
The entropy is then computed as if it were a sum of entropies of independent variables. However, the $X_a$ overlap with each other, and each $X_a$ actually contains $x_i$ for some $i$. I don't see how this entropy decomposes. Starting from the tree factorization and expanding the logarithm:
$$\begin{align} H_b&=-\sum_x\left(\prod_ab_a(X_a)\prod_ib_i(X_i)^{1-d_i}\right)\log\left(\prod_ab_a(X_a)\prod_ib_i(X_i)^{1-d_i}\right)\\ &=-\sum_x\left(\prod_ab_a(X_a)\prod_ib_i(X_i)^{1-d_i}\right)\left(\sum_a\log b_a(X_a)+\sum_i(1-d_i)\log b_i(X_i)\right) \end{align}$$
Let's consider one particular term from the second parenthesis, the one associated with $X_{a^*}$:
$$-\sum_x \left(\prod\limits_a b_a(X_a)\cdot\prod_i b_i(X_i)^{1-d_i} \cdot \log b_{a^*}(X_{a^*})\right)$$
We cannot simply drop the other factors in this term by splitting the outer summation into a sum over $X_{a^*}$ and a sum over the remaining variables $\mathcal{X}\setminus X_{a^*}$, because the $X_a$ and $X_i$ are interwoven with each other.
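For what it's worth, the decomposition does seem to hold numerically on a small tree, so what I am missing is the analytic step. Below is a minimal sketch in pure Python. The 3-node chain, the random potentials, and all function names are my own toy assumptions, not taken from the slides. It builds a joint distribution, uses its exact marginals as the beliefs, and compares the true entropy with the sum-of-entropies formula.

```python
import itertools
import math
import random


def normalize(table):
    """Rescale a dict of non-negative weights so the values sum to 1."""
    z = sum(table.values())
    return {k: v / z for k, v in table.items()}


def marginal(joint, keep):
    """Sum the joint over every variable whose index is not in `keep`."""
    out = {}
    for x, p in joint.items():
        key = tuple(x[i] for i in keep)
        out[key] = out.get(key, 0.0) + p
    return out


def entropy(table):
    """Shannon entropy (in nats) of a distribution stored as a dict."""
    return -sum(p * math.log(p) for p in table.values() if p > 0)


def bethe_check(seed=0, n_states=2):
    rng = random.Random(seed)
    n_vars = 3
    edges = [(0, 1), (1, 2)]  # pairwise factors of a 3-node chain (a tree)

    # Random strictly positive pairwise potentials (toy model, assumption).
    psi = {e: {(s, t): rng.uniform(0.1, 1.0)
               for s in range(n_states) for t in range(n_states)}
           for e in edges}

    # Joint distribution by brute-force enumeration of all configurations.
    joint = normalize({
        x: math.prod(psi[(i, j)][(x[i], x[j])] for (i, j) in edges)
        for x in itertools.product(range(n_states), repeat=n_vars)
    })

    # Beliefs are taken to be the exact marginals of the joint.
    b_fac = {e: marginal(joint, e) for e in edges}
    b_var = {i: marginal(joint, (i,)) for i in range(n_vars)}
    degree = {i: sum(i in e for e in edges) for i in range(n_vars)}

    h_true = entropy(joint)
    # Factor entropies minus (d_i - 1) times each node entropy.
    h_bethe = (sum(entropy(b_fac[e]) for e in edges)
               - sum((degree[i] - 1) * entropy(b_var[i])
                     for i in range(n_vars)))

    # Largest gap between p(x) and prod_a b_a(X_a) * prod_i b_i(x_i)^(1 - d_i).
    max_gap = max(
        abs(p
            - math.prod(b_fac[(i, j)][(x[i], x[j])] for (i, j) in edges)
            * math.prod(b_var[i][(x[i],)] ** (1 - degree[i])
                        for i in range(n_vars)))
        for x, p in joint.items())
    return h_true, h_bethe, max_gap
```

Calling `bethe_check()` returns the true entropy, the Bethe entropy, and the largest pointwise gap between the joint and the product form. On this chain the first two should agree up to floating-point error, which is consistent with the approximation being exact on a tree.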
I also checked Koller's book, but this part of the derivation is missing there as well.
I am wondering if anyone can point me to a derivation of the Bethe entropy. Thanks a lot.