
I'm new to the Metropolis–Hastings algorithm, and I'm having a problem understanding the implementation of its acceptance step:

  $\min\{1, f(Y)/f(x)\}$

I understand that when the ratio is larger than 1, the proposal is always accepted. But when the ratio is less than one, all the R code I have seen compares it with a draw from a uniform distribution:

if(runif(1)<f(Y)/f(x))

Why are we supposed to compare with a single uniform random draw here? Is there any specific reason for that?

rifle123
  • The acceptance step calculation is of the probability $p = \min \{1, f(Y)/f(x)\}$ that we will accept that proposal. The probability that a uniform variate on $(0,1)$ is less than $p$ equals $p$, so the comparison of the uniform variate to $f(Y)/f(x)$ ensures that we accept the proposal with probability $p$. – jbowman Oct 26 '16 at 22:37

3 Answers

  1. You need a way to randomly decide whether to make a move with the required probability (i.e. to "move with probability $\alpha$").

  2. Computers have good ways to generate uniformly distributed random numbers (or rather pseudo-random numbers, with characteristics that give us what we need).

  3. By comparing a standard uniform random number $u$ with $\alpha$, we can say "move if $u < \alpha$", taking the proposal with the required probability (and otherwise remaining where we are).
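As a concrete illustration of the points above, here is a small R sketch (the value of $\alpha$ is made up for illustration): repeating the comparison of a uniform draw against $\alpha$ shows that the move happens with probability $\alpha$.

```r
# Sketch: accept a move with probability alpha by comparing a standard
# uniform draw against alpha. The value of alpha here is illustrative --
# suppose min(1, f(Y)/f(x)) came out to 0.3 for the current proposal.
set.seed(42)
alpha <- 0.3
accepted <- runif(100000) < alpha   # one uniform draw per accept/reject decision
mean(accepted)                      # fraction of moves accepted, close to alpha
```

The long-run fraction of `TRUE` values matches $\alpha$, which is exactly the "move with probability $\alpha$" behaviour the algorithm needs.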

Glen_b
  • So to be more specific, why does comparing with a standard uniform random number ensure the required probability? What is the theory to back up this point? – rifle123 Oct 27 '16 at 04:32
  • The standard uniform has the property that its cdf is the identity function on $(0,1)$. That is $P(U\leq u) =u$. So $P(U\leq \alpha)=\alpha$. If this idea is not something you're familiar with I encourage you to try to learn some basic probability ideas (many of which MCMC relies on) before you try to understand MCMC itself. An introductory applied probability text might be a good place to start. – Glen_b Oct 27 '16 at 06:16

If I understand you correctly, you seem to be asking why the uniform distribution is helpful in the Metropolis algorithm. Why do we use it to make non-uniform random draws? Actually, the answer is that we use it because we can!

When laypeople talk about randomness, by "random" they usually mean drawn uniformly at random. Uniform sampling is so popular because it is the easiest thing to do. Instead of discussing the Metropolis algorithm, let me focus on the most beloved probabilistic example: the biased coin. Imagine that you want to sample outcomes from a Bernoulli distribution (a biased coin) with probability of success (heads) equal to $p$ and of failure (tails) equal to $1-p$. The first problem you'd encounter is that a biased coin is a physical impossibility, so you cannot simply mint one and throw it a number of times. But you don't need the coin: instead you can produce $N$ coupons, write "heads" on $pN$ of them and "tails" on $(1-p)N$ of them, put them into a box, shuffle, and then draw uniformly with replacement. Since there will be $pN$ "heads" coupons in the box, you can expect to draw "heads" with probability $p$. This sampling scheme can easily be extended to random variables with more than two outcomes, but with continuous random variables you'd need an infinite number of coupons, so it would be practically impossible. It shows, however, that you can draw from a discrete non-uniform distribution using a discrete uniform random number generator as a proxy.
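The coupon-box scheme above can be sketched in a few lines of R (the values of $p$ and $N$ are made up for illustration):

```r
# Sketch of the coupon box: N coupons, p*N labelled "heads", drawn
# uniformly with replacement to simulate a biased coin. p and N are
# illustrative values, not from the thread.
set.seed(1)
p <- 0.25
N <- 1000
coupons <- c(rep("heads", p * N), rep("tails", (1 - p) * N))
draws <- sample(coupons, 100000, replace = TRUE)  # uniform draws from the box
mean(draws == "heads")                            # close to p
```

Even though every individual draw from the box is uniform, the labels on the coupons turn those uniform draws into Bernoulli($p$) samples.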

What about sampling from continuous random variables? Actually, the example above can easily be extended to the continuous case, and we can make it more general. Imagine that instead of coupons you produce a wheel of fortune divided into "heads" and "tails" areas, where the length of the outer border of the "heads" area is equal to $p$ and the remainder has length $1-p$ (see the picture below).

[Figure: a wheel of fortune divided into "heads" and "tails" areas]

If you spin this wheel, it will stop either on the "heads" or the "tails" area, with probabilities $p$ and $1-p$ respectively, which follows from the properties of the continuous uniform distribution. Of course, instead of dividing the wheel into two areas, you can divide it into any (possibly infinite, up to measurement precision) number of them. Returning to the two-area wheel that simulates the biased coin: if you need to decide whether to accept some outcome $x$ that appears with probability $p = f(x)$, you spin your wheel and let it decide; in the end you'll accept $x$ with probability $p$ and reject it with probability $1-p$. This means that you can use such a wheel of fortune for sampling from any distribution (though it doesn't mean that this will be the most efficient way to go).
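The wheel of fortune is just a continuous uniform on $[0, 1)$: cut the rim into arcs whose lengths are the outcome probabilities, and one spin picks an outcome. A sketch in R, with made-up outcome labels and probabilities, dividing the wheel into three areas:

```r
# Sketch: the "wheel" is a continuous uniform on [0, 1); the rim is cut
# into arcs whose lengths equal the outcome probabilities. The outcome
# names and probabilities here are illustrative.
set.seed(7)
probs <- c(a = 0.2, b = 0.5, c = 0.3)
boundaries <- cumsum(probs)              # where each arc ends on the rim
spins <- runif(100000)                   # each spin stops uniformly on the rim
outcomes <- names(probs)[findInterval(spins, boundaries) + 1]
table(outcomes) / length(outcomes)       # frequencies close to 0.2, 0.5, 0.3
```

With two arcs of lengths $p$ and $1-p$ this is exactly the biased coin; with more arcs it samples any discrete distribution from one uniform spin.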

This is helpful not only for physical sampling games, but also for computer-based sampling. We have a number of uniform pseudorandom number generators, and sampling uniform numbers is usually easier than sampling from other distributions, so it's a good starting point. You can also read about inverse transform sampling for another application of this idea. It applies to the Metropolis algorithm as well, and to most other similar algorithms.
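For completeness, inverse transform sampling works the same way in the continuous case: push uniform draws through the inverse CDF of the target. A sketch using the exponential distribution as an example (the rate value is made up):

```r
# Sketch of inverse transform sampling: uniform draws pushed through the
# inverse CDF of an Exponential(rate) target. The rate is illustrative.
set.seed(3)
rate <- 2
u <- runif(100000)
x <- -log(1 - u) / rate   # inverse of F(x) = 1 - exp(-rate * x)
mean(x)                   # close to the exponential mean 1/rate
```

Again, the only source of randomness is the uniform generator; the transformation shapes it into the target distribution.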

Tim

Going off of @jbowman's comment: it's a way for a computer to make a decision with some probability. The computer needs a decision rule, and since we want it to move with the probability given by the ratio, a quick if(runif(1)<f(Y)/f(x)) (as you said) will make that decision with the correct probability.
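To tie this back to the question's snippet, here is a minimal random-walk Metropolis sketch in R showing that comparison inside a full sampling loop. The target density and step size are made up for illustration (an unnormalised standard normal), not code from the thread:

```r
# Minimal random-walk Metropolis sketch; the target f and the proposal
# step size are illustrative choices, not from the thread.
set.seed(11)
f <- function(x) exp(-x^2 / 2)       # unnormalised N(0, 1) density
n_iter <- 50000
chain <- numeric(n_iter)
x <- 0
for (i in seq_len(n_iter)) {
  Y <- x + rnorm(1)                  # symmetric random-walk proposal
  if (runif(1) < f(Y) / f(x)) {      # accept with probability min(1, f(Y)/f(x))
    x <- Y
  }
  chain[i] <- x                      # on rejection the chain stays at x
}
c(mean(chain), sd(chain))            # close to 0 and 1 for a N(0, 1) target
```

Note that the same line works whether the ratio exceeds 1 or not: when $f(Y)/f(x) \geq 1$ the uniform draw is always below it, so the proposal is always accepted.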

conv3d
  • Yes, that partly answers my question. So going further, why does using a uniformly distributed number here ensure that the value is accepted with the probability given by the ratio? Is there theory to back this up? That is the main point that confused me. – rifle123 Oct 27 '16 at 04:30
  • @QCL: Hint: if $X$ has the uniform distribution on the interval $[0,1]$, what is the probability that $X\leq p$? – Juho Kokkala Oct 27 '16 at 04:36
  • @JuhoKokkala Ah, thx! – rifle123 Oct 27 '16 at 05:03