
Context: I'm reading up on sampling in MCMC for Machine Learning. On page 5, it mentions rejection sampling:

repeat
  sample xi ~ q(x) and u ~ U(0, 1)
  if u * M * q(xi) < p(xi):
    accept xi
until enough samples

(where $M$ is chosen such that $p(x) \le Mq(x)$ for all $x$)
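A minimal runnable version of the pseudocode above (Python; the target $p(x) = 2x$ on $[0,1]$, the uniform proposal, and the tight constant $M = 2$ are my own illustrative choices, not from the paper):

```python
import random

def rejection_sample(p, q_sample, q_pdf, M, n):
    """Draw n samples from density p by rejection, proposing from q.
    Requires p(x) <= M * q(x) for all x."""
    samples, proposals = [], 0
    while len(samples) < n:
        x = q_sample()            # xi ~ q(x)
        u = random.random()       # u ~ U(0, 1)
        proposals += 1
        if u * M * q_pdf(x) < p(x):
            samples.append(x)     # accept xi
    return samples, proposals

# Target p(x) = 2x on [0, 1]; proposal q = U(0, 1), so q(x) = 1 and M = 2.
random.seed(0)
samples, proposals = rejection_sample(
    lambda x: 2 * x, random.random, lambda x: 1.0, M=2.0, n=10_000)
print(len(samples) / proposals)   # overall acceptance rate, near 1/M = 0.5
```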

Question: In the analysis, the paper says that this may work badly in high-dimensional settings. While I can see that, the paper gives a reason that I don't understand: $P(x \text{ accepted}) = P \left(u < \frac{p(x)}{Mq(x)} \right) = \frac{1}{M}$. This doesn't make sense to me: if the probability were constant, why even bother to evaluate $p(x)$? Should this be "at most $\frac{1}{M}$", or am I just misinterpreting the statement?

blubb

2 Answers


The comment in the paper refers to the fact that, on average, a proposed value is rejected $(M-1)$ out of $M$ times. Conditionally on $x$ the acceptance probability is $\frac{p(x)}{Mq(x)}$, which does depend on $x$; the constant $\frac{1}{M}$ is what you get after averaging over the proposals: $\int q(x) \, \frac{p(x)}{Mq(x)} \, dx = \frac{1}{M}$. The distribution of the accepted values is $p(x)$, regardless of how frequently any particular value is accepted.
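That invariance can be checked numerically. In the sketch below (Python; the target $p(x) = 2x$ on $[0,1]$ with a uniform proposal is my own illustrative choice), loosening the envelope constant from the tight $M = 2$ to $M = 20$ cuts the acceptance rate tenfold, yet the accepted values follow the same distribution:

```python
import random

def sample_target(M, n=20_000):
    """Draw n samples from p(x) = 2x on [0, 1] by rejection, using the
    uniform proposal q(x) = 1.  Any M >= 2 gives a valid envelope."""
    out = []
    while len(out) < n:
        x = random.random()
        u = random.random()
        if u * M < 2 * x:         # u * M * q(x) < p(x), with q(x) = 1
            out.append(x)
    return out

random.seed(2)
results = {}
for M in (2, 5, 20):
    s = sample_target(M)
    # Under p, P(x < 0.5) = 0.5**2 = 0.25, whatever M is
    results[M] = sum(x < 0.5 for x in s) / len(s)
    print(M, results[M])
```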

Michael R. Chernick

The acceptance probability is just what it says: conditionally on $x$ it equals $\frac{p(x)}{Mq(x)}$, so its value depends on $x$, but averaged over $x \sim q(x)$ it is exactly $\frac{1}{M}$ (when $p$ is normalized). In higher dimensions, the pdf $p(x)$ is multidimensional, but you still require a proposal $q(x)$ such that $Mq(x)$ dominates $p(x)$ everywhere. Because such a $q$ can be hard to find if $p$ does not have a classical distribution, your best option is often to take $q$ as a standard distribution (like a multivariate Gaussian) and increase $M$, which makes the algorithm less efficient: the required $M$, and hence the average number of proposals per accepted sample, typically grows rapidly with the dimension.
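The efficiency loss in high dimension can be made concrete. In the sketch below (my own illustrative setup, not from the paper: the target is a $d$-dimensional standard normal and the proposal a slightly wider normal with $\sigma_q = 1.5$, for which the tightest envelope constant is $M = \sigma_q^d$), the acceptance rate decays roughly like $1/M = \sigma_q^{-d}$, i.e. exponentially in $d$:

```python
import math
import random

def acceptance_rate(d, n_trials=20_000, sigma_q=1.5):
    """Empirical acceptance rate for rejection sampling of a d-dimensional
    standard normal N(0, I) with a wider normal N(0, sigma_q^2 I) proposal.
    The density ratio p(x)/q(x) = sigma_q**d * exp(-||x||^2 (1 - sigma_q**-2)/2)
    is maximised at the origin, so the tightest constant is M = sigma_q**d."""
    accepted = 0
    for _ in range(n_trials):
        x = [random.gauss(0.0, sigma_q) for _ in range(d)]
        sq = sum(xi * xi for xi in x)
        # p(x) / (M q(x)) with M = sigma_q**d
        ratio = math.exp(-sq * (1.0 - sigma_q ** -2) / 2.0)
        if random.random() < ratio:
            accepted += 1
    return accepted / n_trials

random.seed(1)
rates = {d: acceptance_rate(d) for d in (1, 5, 10, 20)}
for d, r in rates.items():
    print(d, r)   # decays roughly like 1/M = sigma_q**-d
```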

gui11aume