In the context of MCMC sampling, we often say that the posterior distribution is only proportional to the numerator of Bayes' law. We tend to say that the "denominator" (i.e. the normalizing constant) is irrelevant.

Just to clarify, is this because the denominator term is, in theory, a "constant number" (i.e. a scalar), and the likelihood when multiplied by a scalar doesn't really change anything?

Or is there another reason?

Thanks

stats_noob

1 Answer

The posterior distribution of a Bayesian model with parameters $\theta$, given data $X$, is

$$ p(\theta|X) = \frac{p(X|\theta)p(\theta)}{p(X)} $$

The denominator here standardises the numerator so that it integrates to one over $\theta$ (i.e., so that the posterior is a proper probability distribution). In an ideal case we would calculate both the numerator and the denominator, obtaining a probability distribution in closed form (for example, a normal distribution). This would allow us to characterise the posterior easily, and to calculate quantities like the mean, standard deviation, highest posterior density regions, etc., quickly.

In most cases, we can't do that, because the denominator, $p(X) = \int p(X|\theta)p(\theta)\,d\theta$, is an integral that is too complex to calculate analytically, even when the numerator is easy to evaluate at any given $\theta$. This is where MCMC comes in. By taking steps in the parameter space using probabilities that depend only on the relative values of the numerator at the points we step to and from, we can generate samples from the posterior distribution without knowing what form it actually takes. The neat part of this is that we can simply ignore the complex integral in the denominator (the normalising constant) but still get samples from a proper probability distribution.
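As a rough illustration, here is a minimal sketch of a random-walk Metropolis sampler that only ever evaluates the (log of the) numerator. The toy model (normal data with known unit variance and a wide normal prior), the function names `log_unnormalised_posterior` and `metropolis`, and the step size are all assumptions made for this example, not something taken from the question.

```python
import numpy as np

def log_unnormalised_posterior(theta, data):
    # Assumed toy model: data ~ Normal(theta, 1), prior theta ~ Normal(0, 10^2).
    # This is log p(X|theta) + log p(theta), up to additive constants.
    log_likelihood = -0.5 * np.sum((data - theta) ** 2)
    log_prior = -0.5 * (theta / 10.0) ** 2
    return log_likelihood + log_prior

def metropolis(log_target, n_samples, start=0.0, step=0.5, seed=0):
    # Random-walk Metropolis with a symmetric normal proposal.
    rng = np.random.default_rng(seed)
    samples = np.empty(n_samples)
    current = start
    current_lp = log_target(current)
    for i in range(n_samples):
        proposal = current + step * rng.normal()
        proposal_lp = log_target(proposal)
        # Accept with probability min(1, target(proposal) / target(current)).
        # Any constant factor in the target cancels in this difference of logs.
        if np.log(rng.uniform()) < proposal_lp - current_lp:
            current, current_lp = proposal, proposal_lp
        samples[i] = current
    return samples

# Simulated data for the toy model (true mean 2.0 is an arbitrary choice).
data = np.random.default_rng(1).normal(loc=2.0, scale=1.0, size=50)

samples = metropolis(lambda t: log_unnormalised_posterior(t, data), 5000)

# The same target multiplied by an arbitrary constant (added on the log scale).
shifted = metropolis(
    lambda t: log_unnormalised_posterior(t, data) + np.log(1e6), 5000
)

# Drop an initial burn-in before summarising.
print(samples[500:].mean(), shifted[500:].mean())
```

Both runs target the same posterior, so their summaries should agree, even though neither ever computes $p(X)$.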

When you say

likelihood when multiplied by a scalar doesn't really change anything

you might mean "the numerator", because the likelihood here is also combined with the prior.

So to answer: we ignore the denominator (the normalising constant) because we can. It does not depend on $\theta$, so it doesn't affect the samples we draw or the output of our inference.
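To make the cancellation explicit, here is a sketch assuming a Metropolis sampler with a symmetric proposal (one common MCMC algorithm, chosen here only for illustration). With current value $\theta$ and proposed value $\theta'$, the acceptance probability is

$$ \alpha = \min\left(1, \frac{p(\theta'|X)}{p(\theta|X)}\right) = \min\left(1, \frac{p(X|\theta')p(\theta')/p(X)}{p(X|\theta)p(\theta)/p(X)}\right) = \min\left(1, \frac{p(X|\theta')p(\theta')}{p(X|\theta)p(\theta)}\right) $$

The same $p(X)$ appears in both the top and the bottom of the ratio, so it drops out and never needs to be computed.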

alan ocallaghan