
I am trying to understand the following claim, made in the *Deep Learning* book by Goodfellow et al. about a toy energy-based model (with the apparent motivation of introducing Markov chain Monte Carlo methods):

> To understand why drawing samples from an energy-based model (EBM) is difficult, consider the EBM over just two variables, defining a distribution $p(a,b)$. In order to sample $a$, we must draw from $p(a|b)$, and in order to sample $b$, we must draw it from $p(b|a)$. It seems to be an intractable chicken-and-egg problem.

This sounds strange to me, as I don't understand what prevents us from computing one of the marginals, say $p(a)$, by marginalising over $b$, sampling from $p(a)$, and then sampling from $p(b|a)$. I don't see why we have to deal with the chicken-and-egg problem that the authors mention. Why is my reasoning wrong?
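To make my reasoning concrete, here is a minimal sketch of the procedure I have in mind, for a hypothetical toy EBM over two small discrete variables (the energy table `E` is an arbitrary example of mine, chosen small enough that summing over $b$ is cheap):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy energy function on two discrete variables a, b in {0, ..., K-1}.
K = 5
E = rng.normal(size=(K, K))          # E[a, b] = energy of configuration (a, b)

# Unnormalised joint: p(a, b) proportional to exp(-E(a, b)).
unnorm = np.exp(-E)

# Step 1: marginalise over b to get p(a) (the normaliser Z cancels out).
p_a = unnorm.sum(axis=1)
p_a /= p_a.sum()

# Step 2: sample a ~ p(a).
a = rng.choice(K, p=p_a)

# Step 3: sample b ~ p(b | a) = exp(-E(a, b)) / sum_b' exp(-E(a, b')).
p_b_given_a = unnorm[a] / unnorm[a].sum()
b = rng.choice(K, p=p_b_given_a)

print(a, b)  # an exact sample from p(a, b)
```

For discrete variables this appears to give an exact sample from $p(a,b)$, so I don't see which step breaks down in the two-variable case the book describes.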

  • If only you had the joint density, you could try that. But you don't seem to have that. – Glen_b Dec 30 '19 at 02:27
  • @Glen_b-ReinstateMonica My understanding was that in an energy-based model, you already have an expression $p(X)=\exp(-E(X))$ where $E(X)$ is a sum of energy terms defined on cliques. As a result I'd expect us to already know $p(a,b)$. Aren't undirected graphs used to express joint distributions in general? – Ash Dec 30 '19 at 09:10
  • Being able to compute the density $p(a,b)$ does not mean you can simulate from the distribution associated with $p(a,b)$. – Xi'an Dec 30 '19 at 10:45
  • @Xi'an My apologies as you surely find this trivial, but could you please elaborate? – Ash Dec 30 '19 at 11:57
  • [Why is it necessary to sample from the posterior distribution if we already KNOW the posterior distribution?](https://stats.stackexchange.com/q/307882/7224) – Xi'an Dec 30 '19 at 13:08

0 Answers