I was trying to illustrate that rejection sampling is inefficient when an alternative approach that does not throw away samples is available. (This post is obviously a candidate for migration to SO, but I feel there might also be a statistical aspect to it.)
My example was going to be simulating $\mathrm{Beta}(2,2)$ variates with a rejection algorithm and with rbeta(N, 2, 2). My code is as follows:
# density at the mode: dbeta(0.5, 2, 2) = 1.5, so a box of height 1.5 covers the density
rejectionsampling_beta22 <- function(N){
  px <- runif(N, min = 0, max = 1)    # candidate draws on [0, 1]
  py <- runif(N, min = 0, max = 1.5)  # uniform heights within the box
  ikeep <- (py < 6 * px * (1 - px))   # accept when below f(x) = 6x(1-x)
  return(px[ikeep])
}
samples <- 1500000
The acceptance probability is the area under the density divided by the area of the box; the area under the density is of course equal to one, so the acceptance probability is $1/1.5 = 2/3$, or equivalently, we need 1.5 times as many candidate samples when using rejection.
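As a quick sanity check of that calculation (not part of my original code; the seed and sample size are arbitrary), the empirical acceptance rate should come out close to 2/3:

set.seed(1)
u <- runif(1e6, min = 0, max = 1)
v <- runif(1e6, min = 0, max = 1.5)
mean(v < 6 * u * (1 - u))  # should be approximately 1/1.5 = 0.667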
library(microbenchmark)
result <- microbenchmark(rejectionsampling_beta22(1.5 * samples), rbeta(samples, 2, 2))
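As an aside, I believe microbenchmark's print method accepts a unit argument, so the same timings can also be reported in milliseconds rather than nanoseconds:

print(result, unit = "ms")  # same benchmark, reported in milliseconds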
Here is the result, which surprised me, as I felt that rejecting 1/3 of the draws should be wasteful:
Warning message:
In microbenchmark(rejectionsampling_beta22(1.5 * samples), rbeta(samples, :
Could not measure a positive execution time for 33 evaluations.
> result
Unit: nanoseconds
                                    expr       min        lq         mean    median        uq       max neval cld
 rejectionsampling_beta22(1.5 * samples) 180171443 254891504 248482870.58 257005186 258866373 329533988   100   b
                    rbeta(samples, 2, 2) 253091893 254717640 261602707.69 257391248 259902160 333055032   100   c
                                   neval         0         0        63.97         1         1       604   100   a
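For completeness, here is a quick check (separate from the benchmark; the seed and sample size are arbitrary) that the rejection sampler really does produce $\mathrm{Beta}(2,2)$ draws, using a Kolmogorov-Smirnov test against the theoretical CDF plus a visual comparison:

set.seed(1)
x <- rejectionsampling_beta22(10000)
ks.test(x, pbeta, shape1 = 2, shape2 = 2)  # large p-value expected
hist(x, freq = FALSE, breaks = 50)         # empirical density of accepted draws
curve(dbeta(x, 2, 2), add = TRUE)          # overlay the true Beta(2,2) density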