Context: I'm reading up on sampling in MCMC for Machine Learning. On page 5, it mentions rejection sampling:
repeat
    sample xi ~ q(x) and u ~ U(0, 1)
    if u * M * q(xi) < p(xi):
        accept xi
until enough samples
(where $M$ is chosen such that $p(x) \le Mq(x)$ for all $x$)
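For concreteness, here is a minimal runnable sketch of the pseudocode above. The particular target/proposal pair (standard normal target, Laplace proposal) and the bound $M \approx 1.32$ are my own illustrative choices, not from the paper:

```python
import math
import random

def rejection_sample(p, q_sample, q_pdf, M, n):
    """Draw n samples from target density p by rejection sampling.

    Requires p(x) <= M * q_pdf(x) for all x.
    """
    samples = []
    while len(samples) < n:
        x = q_sample()               # propose x ~ q
        u = random.random()          # u ~ U(0, 1)
        if u * M * q_pdf(x) < p(x):  # accept with probability p(x) / (M q(x))
            samples.append(x)
    return samples

# Illustrative choice: target p = N(0, 1), proposal q = Laplace(0, 1).
# The ratio p/q is maximized at |x| = 1, where it equals
# 2 * exp(0.5) / sqrt(2 * pi) ~= 1.316, so M = 1.32 suffices.
p = lambda x: math.exp(-x * x / 2) / math.sqrt(2 * math.pi)
q_pdf = lambda x: 0.5 * math.exp(-abs(x))

def q_sample():
    # Laplace(0, 1) draw: exponential magnitude with a random sign
    x = random.expovariate(1.0)
    return x if random.random() < 0.5 else -x

random.seed(0)
xs = rejection_sample(p, q_sample, q_pdf, 1.32, 1000)
```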
Question: In the analysis, the paper says that it may work badly in high-dimensional settings. While I can see that, the paper gives a reason that I don't understand: $P(x \text{ accepted}) = P \left(u < \frac{p(x)}{Mq(x)} \right) = \frac{1}{M}$. This doesn't make sense to me. If the probability were constant, why even bother to evaluate $p(x)$? Should this be an "at most $\frac{1}{M}$"? Or am I just misinterpreting the statement?
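To make the quoted identity concrete, one can count the empirical acceptance fraction over many proposals. This self-contained check uses an illustrative pair of my own choosing (normalized target $p = N(0,1)$, proposal $q = \text{Laplace}(0,1)$, bound $M = 1.32$), not anything from the paper:

```python
import math
import random

# Normalized target p = N(0, 1) and proposal q = Laplace(0, 1);
# M = 1.32 satisfies p(x) <= M * q(x) for all x (illustrative values).
p = lambda x: math.exp(-x * x / 2) / math.sqrt(2 * math.pi)
q_pdf = lambda x: 0.5 * math.exp(-abs(x))
M = 1.32

random.seed(1)
trials, accepts = 200_000, 0
for _ in range(trials):
    # Laplace(0, 1) draw: exponential magnitude with a random sign
    x = random.expovariate(1.0) * (1 if random.random() < 0.5 else -1)
    if random.random() * M * q_pdf(x) < p(x):
        accepts += 1

rate = accepts / trials
print(rate, 1 / M)  # overall acceptance fraction vs. 1/M ~= 0.758
```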