
I came across this answer which states that:

NOT all the MCMC methods avoid the need for the normalising constant.

I was under the impression that one of the strengths of MCMC methods (usually employed for parameter inference) is that they avoid the need to obtain the Bayesian evidence, requiring only the evaluation of the likelihood and the prior.

Apparently this is not true, so my question is: what MCMC methods need to calculate the Bayesian evidence (normalising constant)?

Gabriel
  • There is a huge literature on using MCMC methods to calculate the normalising constant (evidence) and thus this question is too broad. – jaradniemi Nov 09 '18 at 14:55
  • 1
    The question is not really about *how* to calculate the normalizing constant but which MCMC methods employed in parameter inference do it and *why*. – Gabriel Nov 09 '18 at 14:59
  • @Gabriel (sorry, accidentally entered comment before finishing), the evidence function $P(D)=\int_{\Theta} P(D|\theta)\,P(\theta)\,d\theta$ is a sum of the probability (up to a proportionality constant) of all possible models $\theta$. If you had a way to calculate this, then you wouldn't need to use an MCMC sampler. The answer you posted needs to be reworded. – curious_dan Nov 09 '18 at 16:04
  • @curious_dan I still don't understand; I haven't posted any answers. – Gabriel Nov 09 '18 at 16:10
  • Sorry, you said "I came across this answer" and then posted another thread. This is the answer that needs to be reworded. All MCMC samplers avoid the need for the normalizing constant. If you knew the normalizing constant, then you wouldn't use MCMC. – curious_dan Nov 09 '18 at 16:13
  • Oh I understand now. So basically, you are saying that my initial understanding of the MCMC process is correct: the evidence is in fact **not** calculated. Would you like to turn your comment into an answer? – Gabriel Nov 09 '18 at 16:23
  • @Gabriel, can you confirm that you are asking: "Is there an MCMC method that computes the evidence function?" --as opposed to-- "Is there an MCMC that requires the evidence function?" I think I was answering the latter. To answer the former--- we should consider that _in some sense_ the whole point of MCMC(using any approach) is to approximate the evidence function. If you confirm my understanding then I can post an answer that makes this more clear – curious_dan Nov 09 '18 at 17:33
  • @curious_dan I've edited the question to make it as clear as possible. The linked answer states that *not* all the MCMC methods avoid having to calculate the evidence, which means that there are *some* MCMC methods that need to calculate that value to work. – Gabriel Nov 10 '18 at 13:45

1 Answer


You need to consider what the actual output from MCMC is: a very large set of samples $\{\theta_1, \theta_2, \dots, \theta_N\}$ that you hope are representative of the posterior distribution, $\pi(\theta)$, say, of the parameter $\theta$.

MCMC proceeds by starting from some $\theta_0$, moving on to $\theta_1$, and so on until equilibrium is reached. Various indicative tests for convergence exist, but there is no proof that it has occurred. It is assumed that after convergence each $\theta_i$ is a draw from the posterior distribution.
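
For example, to make the cancellation explicit: in a Metropolis-Hastings sampler with a symmetric proposal, a move from the current $\theta_i$ to a proposed $\theta'$ is accepted with probability
$$
\alpha = \min\left(1,\ \frac{\pi(\theta')}{\pi(\theta_i)}\right) = \min\left(1,\ \frac{P(D|\theta')\,P(\theta')}{P(D|\theta_i)\,P(\theta_i)}\right),
$$
because $\pi(\theta) \propto P(D|\theta)\,P(\theta)$. The evidence $P(D)$ cancels in the ratio, which is why it never has to be computed.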

You are then in a position to examine the empirical distribution of the MCMC output and summarise it using all the usual tools. You do not need to calculate the evidence to do that.
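
To illustrate, here is a minimal sketch (not tied to any particular library, using a made-up toy normal-mean model) of a random-walk Metropolis sampler. The target is only ever evaluated as the unnormalised product likelihood × prior, so the evidence never appears anywhere in the code:

```python
import numpy as np

# Toy model (hypothetical example): data y ~ Normal(theta, 1), prior theta ~ Normal(0, 10).
rng = np.random.default_rng(0)
y = rng.normal(1.5, 1.0, size=50)  # synthetic data with "true" theta = 1.5

def log_unnormalised_posterior(theta):
    """log(likelihood * prior), each up to an additive constant; P(D) is never computed."""
    log_lik = -0.5 * np.sum((y - theta) ** 2)      # Normal(theta, 1) likelihood
    log_prior = -0.5 * theta ** 2 / 10.0 ** 2      # Normal(0, 10) prior
    return log_lik + log_prior

def random_walk_metropolis(n_samples, step=0.2, theta0=0.0):
    samples = np.empty(n_samples)
    theta, log_post = theta0, log_unnormalised_posterior(theta0)
    for i in range(n_samples):
        proposal = theta + step * rng.normal()     # symmetric proposal
        log_post_prop = log_unnormalised_posterior(proposal)
        # Accept/reject uses only the unnormalised posterior: the evidence cancels.
        if np.log(rng.uniform()) < log_post_prop - log_post:
            theta, log_post = proposal, log_post_prop
        samples[i] = theta
    return samples

samples = random_walk_metropolis(20_000)
print("posterior mean estimate:", samples[5_000:].mean())  # summarise after burn-in
```

The samples (after discarding burn-in) can then be summarised directly, exactly as described above, without ever touching the normalising constant.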

JeremyC
  • Hi, yes I understand all that. But that answer stated that there are in fact MCMC methods that do employ the evidence (normalizing constant), hence my question. If your answer is that there are no MCMC methods that require calculating the evidence to work, could you make that clear? – Gabriel Nov 09 '18 at 18:00
  • I cannot claim knowledge of every MCMC method that has ever been written - and there are lots. I cannot see why anyone would ever want to calculate the evidence as part of using MCMC, but they might use MCMC to calculate the evidence. – JeremyC Nov 09 '18 at 22:26