Can someone tell me how to simulate $\mathrm{Bernoulli}\left({a\over b}\right)$, where $a,b\in \mathbb{N}$, using a coin toss (as many times as you require) with $P(H)=p$ ?
I was thinking of using rejection sampling, but could not nail it down.
Because there are uncountably many solutions, let's find an efficient one.
The idea behind this one starts with a standard way to implement a Bernoulli variable: compare a uniform random variable $U$ to the parameter $a/b$. When $U \lt a/b$, return $1$; otherwise, return $0$.
We can use the $p$-coin as a uniform random number generator. To generate a number $U$ uniformly within any interval $[x, y)$, flip the coin. When it's heads, recursively generate a uniform value $U$ in the first $p$ part of the interval; when it's tails, recursively generate $U$ from the last $1-p$ part of the interval. At some point the target interval will become so small that it doesn't really matter how you pick a number from it: that's where the recursion bottoms out. It's obvious this procedure generates uniform variates (up to any desired precision), as is easily proven by induction.
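For concreteness, here is a quick sketch of this recursion, written iteratively in Python; the function name uniform_from_coin and the precision cutoff eps are just illustrative choices (the R implementation further below implements the faster, early-terminating variant instead):

import random

def uniform_from_coin(p, x=0.0, y=1.0, eps=1e-12):
    # Generate an (approximately) Uniform[x, y) value using only a p-coin:
    # each flip shrinks the interval by a factor of p (heads) or 1 - p (tails).
    while y - x > eps:
        if random.random() < p:      # heads, with probability p
            y = x + (y - x) * p      # keep the first p part of [x, y)
        else:                        # tails, with probability 1 - p
            x = x + (y - x) * p      # keep the last 1 - p part of [x, y)
    return x                         # any point of the now-tiny interval will do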
This idea is not efficient, but it leads to an efficient method. Since at each stage you are going to draw a number from some given interval $[x,y)$, why not first check whether you need to draw it at all? If your target value lies outside this interval, you already know the result of the comparison between the random value and the target. Thus, this algorithm tends to terminate rapidly. (This could be construed as the rejection sampling procedure requested in the question.)
We can optimize this algorithm further. At any stage, we actually have two coins we can use: by relabeling our coin we can make it into one that is heads with chance $1-p$. Therefore, as a precomputation we may recursively choose whichever relabeling leads to the lower expected number of flips needed for termination. (This calculation can be an expensive step.)
For example, it's inefficient to use a coin with $p=0.9$ to emulate a Bernoulli$(0.01)$ variable directly: it takes almost ten flips on average. But if we use a $1-p=1-0.9=0.1$ coin, then we will usually be done within two flips and the expected number of flips is just $1.2$.
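To see roughly where that $1.2$ comes from: with the $0.1$-coin the first flip sends us to $[0.1, 1)$ with probability $0.9$, which already excludes the target $0.01$, so a single flip settles it; in the remaining $0.1$ of cases about three flips are needed in total (the target happens to sit exactly on an endpoint of the later subintervals, which occasionally prolongs things), giving roughly $0.9(1) + 0.1(3) = 1.2$ expected flips.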
Here are the details.
Partition any given half-open interval $I = [x, y)$ into the intervals
$$[x,y) = [x, x + (y-x)p) \cup [x + (y-x)p, y) = s(I,H) \cup s(I,T).$$
This defines the two transformations $s(*,H)$ and $s(*,T)$ which operate on half-open intervals.
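For example, starting from $I = [0,1)$,
$$s([0,1),\,H) = [0, p), \qquad s([0,1),\,T) = [p, 1),$$
and applying $s(\cdot, H)$ once more gives $s([0,p),\,H) = [0, p^2)$.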
As a matter of terminology, if $I$ is any set of real numbers let the expression
$$t \lt I$$
mean that $t$ is a lower bound for $I$: $t \lt x$ for all $x \in I$. Similarly, $t \gt I$ means $t$ is an upper bound for $I$.
Write $a/b = t$. (In fact, it will make no difference if $t$ is real instead of rational; we only require that $0 \le t \le 1$.)
Here is the algorithm to produce a variate $Z$ with the desired Bernoulli parameter:
Set $n=0$ and $I_n = I_0 = [0,1)$.
While $(t\in I_{n})$ {Toss the coin to produce $X_{n+1}$. Set $I_{n+1} = s(I_n, X_{n+1}).$ Increment $n$.}
If $t \gt I_{n}$ then set $Z=1$. Otherwise, set $Z=0$.
To illustrate, here is an `R` implementation of the algorithm as the function `draw`. Its arguments are the target value $t$ and the coin's probability $p$; it starts from the interval $[0,1)$. It uses the auxiliary function `s` implementing $s$. Although it does not need to, it also tracks the number of coin tosses. It returns the random variable, the count of tosses, and the last interval it inspected.
s <- function(x, ab, p) {
  # Split the half-open interval ab = c(lower, upper): heads (x == 1) keeps
  # the first p part, tails keeps the last 1 - p part.
  d <- diff(ab) * p
  if (x == 1) c(ab[1], ab[1] + d) else c(ab[1] + d, ab[2])
}

draw <- function(target, p) {
  between <- function(z, ab) prod(z - ab) <= 0    # is z within the interval?
  ab <- c(0, 1)                                   # current interval
  n <- 0                                          # number of coin tosses so far
  while (between(target, ab)) {
    n <- n + 1; ab <- s(runif(1) < p, ab, p)      # toss the coin, shrink the interval
  }
  return(c(target > ab[2], n, ab))                # Z, number of tosses, final interval
}
As an example of its use and test of its accuracy, take the case $t=1/100$ and $p=0.9$. Let's draw $10,000$ values using the algorithm, report on the mean (and its standard error), and indicate the average number of flips used.
target <- 0.01
p <- 0.9
set.seed(17)
sim <- replicate(1e4, draw(target, p))
(m <- mean(sim[1, ])) # The mean
(m - target) / (sd(sim[1, ]) / sqrt(ncol(sim))) # A Z-score to compare to `target`
mean(sim[2, ]) # Average number of flips
In this simulation the mean of the simulated values was $0.0095$. Although lower than the target of $0.01$, the Z-score of $-0.5154$ is not significant: this deviation can be attributed to chance. The average number of flips was $9.886$--a little less than ten. If we had used the $1-p$ coin, the mean would have been $0.0094$--still not significantly different from the target, but only $1.177$ flips would have been needed on average.
Here's a solution (bit of a messy one, but it's my first stab). You can actually ignore the $P(H) = p$ and WLOG assume $P(H)=1/2$. Why? There exists a clever algorithm to generate an unbiased coin flip from pairs of biased coin flips, so we can assume $P(H)=1/2$.
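Presumably this refers to the classic von Neumann trick: flip twice, count HT as "heads" and TH as "tails", and try again otherwise. A minimal sketch (the helper names p_coin and fair_flip are just illustrative):

import random

def p_coin(p):
    # stand-in for the biased coin: True (heads) with probability p
    return random.random() < p

def fair_flip(p):
    # von Neumann's trick: HT -> heads, TH -> tails; discard HH and TT
    while True:
        first, second = p_coin(p), p_coin(p)
        if first != second:   # P(HT) = P(TH) = p(1 - p), so both outcomes are equally likely
            return first      # True ("heads") with probability exactly 1/2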
To generate a $\text{Bernoulli}(\frac{a}{b})$, I can think of two solutions (the first is not my own, but the second is a generalization):
Flip the unbiased coin $b$ times. If exactly $a$ heads are not present, start over. If exactly $a$ heads are present, return whether the first flip was heads (because $P(\text{first flip is heads} \mid a \text{ heads in } b \text{ flips}) = \frac{a}{b}$). A short sketch of this method appears after the Python example below.
This can be extended to a $\text{Bernoulli}(p)$ for any $p$. Write $p$ in binary form. For example, $0.1 = 0.0001100110011001100110011... \text{ (base 2)}$
We'll create a new binary number using coin flips. Start with $0.$, and add a digit after each flip: $1$ for heads, $0$ for tails. At each flip, compare your new binary number with the binary representation of $p$ truncated to the same number of digits. Eventually the two will diverge; return $1$ if the truncated binary representation of $p$ is greater than your number, and $0$ otherwise.
In Python:
import random

def simulate(p):
    binary_p = float_to_binary(p)   # binary expansion of p as a string, e.g. "0.0110..."
    binary_string = '0.'            # expansion built up from fair coin flips
    index = 3                       # len('0.') + 1: length after the first appended bit
    while True:
        binary_string += '0' if random.random() < 0.5 else '1'
        if binary_string != binary_p[:index]:
            # the expansions have diverged, so the comparison is decided
            return binary_string < binary_p[:index]
        index += 1
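The helper float_to_binary isn't defined above; here is one minimal sketch of what it could look like (the 60-digit cutoff is an arbitrary choice, and floating-point rounding limits how many digits are meaningful):

def float_to_binary(p, digits=60):
    # binary expansion of p in [0, 1) as a string such as "0.000110011..."
    bits = '0.'
    for _ in range(digits):
        p *= 2                          # shift the next binary digit into the integer part
        bits += '1' if p >= 1 else '0'
        p -= int(p)                     # keep only the fractional part and continue
    return bits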
Some proof:
import numpy as np
np.mean([simulate(0.4) for i in range(10000)])
is about 0.4 (not fast however)
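For completeness, here is a minimal sketch of the first method as well (assuming, as above, that we have already reduced to a fair coin; the name bernoulli_rational is just illustrative):

import random

def bernoulli_rational(a, b):
    # return 1 with probability a/b using only fair coin flips
    while True:
        flips = [random.random() < 0.5 for _ in range(b)]   # b fair flips
        if sum(flips) == a:            # keep only runs with exactly a heads
            return int(flips[0])       # P(first flip is heads | a heads in b flips) = a/b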
I see a simple solution, but no doubt there are many ways to do it, some presumably simpler than this. This approach can be broken down into two steps:
Generating from two events with equal probability given an unfair coin-tossing procedure (the combination of the particular coin and the method by which it is tossed generating a head with probability $p$). We can call these two equally probable events $H^*$ and $T^*$. [There's a simple approach for this that takes pairs of tosses, $H^*=(H,T)$ and $T^*=(T,H)$, to produce two equally likely outcomes, with all other outcomes leading to a new pair of tosses to try again.]
Now you generate a random walk with two absorbing states using the simulated fair coin. By choosing the distance of the absorbing states from the origin (one above and one below it), you can set the chance of absorption at, say, the upper absorbing state to be a desired ratio of integers. Specifically, if you place the upper absorbing barrier at $b-a$ and the lower one at $-a$ (and start the process from the origin), and run the random walk until absorption, the probability of absorption at the upper barrier is $\frac{a}{a+(b-a)} = \frac{a}{b}$.
(There are some calculations to be done here to show it, but you can get the probabilities out fairly easily by working with recurrence relations ... or by summing infinite series ... or there are other ways.)
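Here is a minimal sketch of the whole procedure (the name bernoulli_walk is just illustrative, and for brevity the simulated fair coin from step 1 is stood in by random.random() < 0.5):

import random

def bernoulli_walk(a, b):
    # return 1 with probability a/b via a symmetric random walk (gambler's ruin)
    position = 0
    while -a < position < b - a:                        # absorbing barriers at -a and b - a
        position += 1 if random.random() < 0.5 else -1  # one fair step of the walk
    return int(position == b - a)                       # absorbed at the upper barrier?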