In their textbook, Graphical Models, Exponential Families and Variational Inference, M. Jordan and M. Wainwright discuss the connection between Exponential families and Markov Random Fields (undirected graphical models).
I am trying to understand better the relationship between them with the following questions:
- Are all MRFs members of the Exponential families?
- Can all members from the Exponential families be represented as an MRF?
- If MRFs $\neq$ Exponential families, what are some good examples of distributions of one type not included in the other?
From what I understand in their textbook (Chapter 3), Jordan and Wainwright present the next argument:
Say we have a scalar random variable $X$ that follows some distribution $p$, we draw $n$ i.i.d. observations $X^1, \ldots, X^n$, and we want to identify $p$.
We compute the empirical expectations of certain functions $\phi_\alpha$:
$\hat{\mu}_\alpha= \frac{1}{n}\sum^n_{i=1}\phi_\alpha(X^i)$ for all $\alpha \in \mathcal{I}$,
where each $\alpha$ in some set $\mathcal{I}$ indexes a function $\phi_\alpha: \mathcal{X} \rightarrow \mathbb{R}$.
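To make the notation concrete for myself, here is a minimal sketch of how I read the definition of $\hat{\mu}_\alpha$. The data (standard normal draws), the index set $\mathcal{I} = \{1, 2\}$, the choice $\phi_1(x) = x$, $\phi_2(x) = x^2$, and the helper name `empirical_moments` are all my own assumptions for illustration, not something taken from the book.

```python
import numpy as np

def empirical_moments(samples, phis):
    """Return mu_hat[alpha] = (1/n) * sum_i phi_alpha(X^i) for every alpha."""
    return {alpha: float(np.mean(phi(samples))) for alpha, phi in phis.items()}

# Hypothetical data: n = 1000 i.i.d. draws of a scalar X (standard normal here).
rng = np.random.default_rng(0)
samples = rng.normal(size=1000)

# Hypothetical functions phi_alpha, indexed by alpha in I = {1, 2}.
phis = {1: lambda x: x, 2: lambda x: x ** 2}

mu_hat = empirical_moments(samples, phis)
print(mu_hat)
```

With these choices, $\hat{\mu}_1$ and $\hat{\mu}_2$ are just the sample mean and the sample second moment.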
Then, if we require the following two sets of quantities to match (in order to identify $p$):
- The expectations $E_p[\phi_\alpha(X)]=\int_\mathcal{X}\phi_\alpha(x)p(x)\nu(dx)$ of the sufficient statistics $\phi$ under the distribution $p$
- The expectations $\hat{\mu}_\alpha$ under the empirical distribution
we get an underdetermined problem, in the sense that there are many distributions $p$ that are consistent with the observations. So we need a principle for choosing among them (to identify $p$).
If we use the principle of maximum entropy to remove this indeterminacy, we get a single $p$:
$\DeclareMathOperator*{\argmax}{arg\,max} p^* = \argmax_{p\in{\mathcal{P}}} \,H(p)$ subject to $E_p[\phi_\alpha(X)] = \hat{\mu}_\alpha$ for all $\alpha \in \mathcal{I}$
where this $p^*$ takes the form $p_\theta(x) \propto \exp\left\{\sum_{\alpha \in \mathcal{I}}\theta_\alpha \phi_\alpha(x)\right\}$, where $\theta \in \mathbb{R}^d$ represents a parameterization of the distribution in exponential family form.
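As far as I can tell, the exponential form comes out of a standard Lagrangian argument. The following is only a heuristic sketch under my own assumptions: $H(p) = -\int_\mathcal{X} p(x)\log p(x)\,\nu(dx)$, the multiplier $\lambda$ for the normalization constraint is my notation, and I ignore the nonnegativity constraint and any regularity conditions.

$$\mathcal{L}(p, \theta, \lambda) = -\int_\mathcal{X} p(x)\log p(x)\,\nu(dx) + \sum_{\alpha \in \mathcal{I}} \theta_\alpha\left(\int_\mathcal{X}\phi_\alpha(x)p(x)\,\nu(dx) - \hat{\mu}_\alpha\right) + \lambda\left(\int_\mathcal{X} p(x)\,\nu(dx) - 1\right)$$

Setting the derivative with respect to $p(x)$ to zero gives

$$-\log p(x) - 1 + \sum_{\alpha \in \mathcal{I}} \theta_\alpha\phi_\alpha(x) + \lambda = 0 \quad\Longrightarrow\quad p(x) = \exp\left\{\sum_{\alpha \in \mathcal{I}} \theta_\alpha\phi_\alpha(x) + \lambda - 1\right\} \propto \exp\left\{\sum_{\alpha \in \mathcal{I}} \theta_\alpha\phi_\alpha(x)\right\},$$

so the multipliers $\theta_\alpha$ of the moment constraints play the role of the exponential family parameters, and $\lambda$ only fixes the normalization.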
In other words, if we
- Make the expectations under $p$ consistent with the expectations under the empirical distribution
- Use the principle of maximum entropy to get rid of the indeterminacy
$\rightarrow$ We end up with a distribution in the exponential family.
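To check my understanding of these two steps, I tried them numerically on a toy problem. This is a sketch under my own assumptions, not the book's procedure: a finite support $\{1, \ldots, 6\}$, a single function $\phi(x) = x$, data from a hypothetical loaded die, and plain gradient descent on the dual objective $A(\theta) - \langle\theta, \hat{\mu}\rangle$, where $A$ is the log-normalizer. The names `gibbs`, `fit_max_entropy`, and `entropy` are mine.

```python
import numpy as np

def gibbs(Phi, theta):
    """p_theta(x) proportional to exp(<theta, phi(x)>) on a finite support."""
    logits = Phi @ theta
    p = np.exp(logits - logits.max())  # subtract the max for numerical stability
    return p / p.sum()

def fit_max_entropy(Phi, mu_hat, step=0.1, n_iter=2000):
    """Gradient descent on the dual; its gradient is E_theta[phi] - mu_hat."""
    theta = np.zeros(Phi.shape[1])
    for _ in range(n_iter):
        theta -= step * (Phi.T @ gibbs(Phi, theta) - mu_hat)
    return theta, gibbs(Phi, theta)

def entropy(p):
    p = p[p > 0]  # treat 0 * log(0) as 0
    return float(-(p * np.log(p)).sum())

# Hypothetical setup: faces 1..6, one function phi(x) = x.
support = np.arange(1, 7)
Phi = support[:, None].astype(float)  # row x holds phi(x)

# Hypothetical data from a loaded die; only its empirical mean is used below.
rng = np.random.default_rng(0)
samples = rng.choice(support, size=5000, p=[0.05, 0.10, 0.15, 0.20, 0.20, 0.30])
mu_hat = np.array([samples.mean()])

theta, p_star = fit_max_entropy(Phi, mu_hat)
empirical = np.bincount(samples, minlength=7)[1:] / samples.size

print("theta:", theta)
print("fitted mean:", Phi.T @ p_star, "empirical mean:", mu_hat)
print("entropy of p*:", entropy(p_star), "entropy of empirical pmf:", entropy(empirical))
```

Both the empirical pmf and `p_star` satisfy the moment constraint, which is the underdetermination described above. The maximum entropy principle selects `p_star`, whose log-probabilities are affine in $\phi(x)$, i.e. it has exponential family form.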
However, this looks more like an argument to introduce exponential families, and (as far as I can understand) it does not describe the relationship between MRFs and exp. families. Am I missing anything?