Can someone tell me how to simulate $\mathrm{Bernoulli}\left({a\over b}\right)$, where $a,b\in \mathbb{N}$, using a coin toss (as many times as you require) with $P(H)=p$ ?
I was thinking of using rejection sampling, but could not nail it down.
Because there are uncountably many solutions, let's find an efficient one.
The idea behind this one starts with a standard way to implement a Bernoulli variable: compare a uniform random variable $U$ to the parameter $a/b$. When $U \lt a/b$, return $1$; otherwise, return $0$.
We can use the $p$-coin as a uniform random number generator. To generate a number $U$ uniformly within any interval $[x, y)$, flip the coin. When it's heads, recursively generate a uniform value $U$ in the first $p$ part of the interval; when it's tails, recursively generate $U$ from the last $1-p$ part of the interval. At some point the target interval will become so small that it doesn't really matter how you pick a number from it: that's where the recursion bottoms out. It's obvious this procedure generates uniform variates (up to any desired precision), as is easily proven by induction.
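For concreteness, here is a quick sketch of this recursion, written iteratively in Python; the function name uniform_from_coin and the precision cutoff eps are just illustrative choices (the R implementation further below implements the faster, early-terminating variant instead):

import random

def uniform_from_coin(p, x=0.0, y=1.0, eps=1e-12):
    # Generate an (approximately) Uniform[x, y) value using only a p-coin:
    # each flip shrinks the interval by a factor of p (heads) or 1 - p (tails).
    while y - x > eps:
        if random.random() < p:      # heads, with probability p
            y = x + (y - x) * p      # keep the first p part of [x, y)
        else:                        # tails, with probability 1 - p
            x = x + (y - x) * p      # keep the last 1 - p part of [x, y)
    return x                         # any point of the now-tiny interval will do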
This idea is not efficient, but it leads to an efficient method. Since at each stage you are going to draw a number from some given interval $[x,y)$, why not first check whether you need to draw it at all? If your target value lies outside this interval, you already know the result of the comparison between the random value and the target. Thus, this algorithm tends to terminate rapidly. (This could be construed as the rejection sampling procedure requested in the question.)
We can optimize this algorithm further. At any stage, we actually have two coins we can use: by relabeling our coin we can make it into one that is heads with chance $1-p$. Therefore, as a precomputation we may recursively choose whichever relabeling leads to the lower expected number of flips needed for termination. (This calculation can be an expensive step.)
For example, it's inefficient to use a coin with $p=0.9$ to emulate a Bernoulli$(0.01)$ variable directly: it takes almost ten flips on average. But if we use a $1-p=1-0.9=0.1$ coin, then we will usually be done within two flips and the expected number of flips is just $1.2$.
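To see roughly where that $1.2$ comes from: with the $0.1$-coin the first flip sends us to $[0.1, 1)$ with probability $0.9$, which already excludes the target $0.01$, so a single flip settles it; in the remaining $0.1$ of cases about three flips are needed in total (the target happens to sit exactly on an endpoint of the later subintervals, which occasionally prolongs things), giving roughly $0.9(1) + 0.1(3) = 1.2$ expected flips.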
Here are the details.
Partition any given half-open interval $I = [x, y)$ into the intervals
$$[x,y) = [x, x + (y-x)p) \cup [x + (y-x)p, y) = s(I,H) \cup s(I,T).$$
This defines the two transformations $s(*,H)$ and $s(*,T)$ which operate on half-open intervals.
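For example, starting from $I = [0,1)$,
$$s([0,1),\,H) = [0, p), \qquad s([0,1),\,T) = [p, 1),$$
and applying $s(\cdot, H)$ once more gives $s([0,p),\,H) = [0, p^2)$.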
As a matter of terminology, if $I$ is any set of real numbers let the expression
$$t \lt I$$
mean that $t$ is a lower bound for $I$: $t \lt x$ for all $x \in I$. Similarly, $t \gt I$ means $t$ is an upper bound for $I$.
Write $a/b = t$. (In fact, it will make no difference if $t$ is real instead of rational; we only require that $0 \le t \le 1$.)
Here is the algorithm to produce a variate $Z$ with the desired Bernoulli parameter:
Set $n=0$ and $I_n = I_0 = [0,1)$.
While $(t\in I_{n})$ {Toss the coin to produce $X_{n+1}$. Set $I_{n+1} = s(I_n, X_{n+1}).$ Increment $n$.}
If $t \gt I_{n}$ then set $Z=1$. Otherwise, set $Z=0$.
To illustrate, here is an `R` implementation of the algorithm as the function `draw`. Its arguments are the target value $t$ and the coin's probability $p$; it starts from the interval $[0,1)$. It uses the auxiliary function `s` implementing $s$. Although it does not need to, it also tracks the number of coin tosses. It returns the random variable, the count of tosses, and the last interval it inspected.
s <- function(x, ab, p) {
  # Split the half-open interval ab = c(lower, upper): heads (x == 1) keeps
  # the first p part, tails keeps the last 1 - p part.
  d <- diff(ab) * p
  if (x == 1) c(ab[1], ab[1] + d) else c(ab[1] + d, ab[2])
}

draw <- function(target, p) {
  between <- function(z, ab) prod(z - ab) <= 0    # is z within the interval?
  ab <- c(0, 1)                                   # current interval
  n <- 0                                          # number of coin tosses so far
  while (between(target, ab)) {
    n <- n + 1; ab <- s(runif(1) < p, ab, p)      # toss the coin, shrink the interval
  }
  return(c(target > ab[2], n, ab))                # Z, number of tosses, final interval
}
As an example of its use and test of its accuracy, take the case $t=1/100$ and $p=0.9$. Let's draw $10,000$ values using the algorithm, report on the mean (and its standard error), and indicate the average number of flips used.
target <- 0.01
p <- 0.9
set.seed(17)
sim <- replicate(1e4, draw(target, p))
(m <- mean(sim[1, ])) # The mean
(m - target) / (sd(sim[1, ]) / sqrt(ncol(sim))) # A Z-score to compare to `target`
mean(sim[2, ]) # Average number of flips
In this simulation the mean of the simulated values was $0.0095$. Although lower than the target of $0.01$, the Z-score of $-0.5154$ is not significant: this deviation can be attributed to chance. The average number of flips was $9.886$--a little less than ten. If we had used the $1-p$ coin, the mean would have been $0.0094$--still not significantly different from the target, but only $1.177$ flips would have been needed on average.
Here's a solution (bit of a messy one, but it's my first stab). You can actually ignore the $P(H) = p$ and WLOG assume $P(H)=1/2$. Why? There exists a clever algorithm to generate an unbiased coin flip from pairs of biased coin flips, so we can assume $P(H)=1/2$.
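Presumably this refers to the classic von Neumann trick: flip twice, count HT as "heads" and TH as "tails", and try again otherwise. A minimal sketch (the helper names p_coin and fair_flip are just illustrative):

import random

def p_coin(p):
    # stand-in for the biased coin: True (heads) with probability p
    return random.random() < p

def fair_flip(p):
    # von Neumann's trick: HT -> heads, TH -> tails; discard HH and TT
    while True:
        first, second = p_coin(p), p_coin(p)
        if first != second:   # P(HT) = P(TH) = p(1 - p), so both outcomes are equally likely
            return first      # True ("heads") with probability exactly 1/2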
To generate a $\text{Bernoulli}(\frac{a}{b})$, I can think of two solutions (the first is not my own, but the second is a generalization):
Flip the unbiased coin $b$ times. If exactly $a$ heads are not present, start over. If exactly $a$ heads are present, return whether the first flip was heads (because $P(\text{first flip is heads} \mid a \text{ heads in } b \text{ flips}) = \frac{a}{b}$). A short sketch of this method appears after the Python example below.
This can be extended to a $\text{Bernoulli}(p)$ for any $p$. Write $p$ in binary form. For example, $0.1 = 0.0001100110011001100110011... \text{ (base 2)}$
We'll create a new binary number using coin flips. Start with $0.$, and add a digit after each flip: $1$ for heads, $0$ for tails. At each flip, compare your new binary number with the binary representation of $p$ truncated to the same number of digits. Eventually the two will diverge; return $1$ if the truncated binary representation of $p$ is greater than your number, and $0$ otherwise.
In Python:
import random

def simulate(p):
    binary_p = float_to_binary(p)   # binary expansion of p as a string, e.g. "0.0110..."
    binary_string = '0.'            # expansion built up from fair coin flips
    index = 3                       # len('0.') + 1: length after the first appended bit
    while True:
        binary_string += '0' if random.random() < 0.5 else '1'
        if binary_string != binary_p[:index]:
            # the expansions have diverged, so the comparison is decided
            return binary_string < binary_p[:index]
        index += 1
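The helper float_to_binary isn't defined above; here is one minimal sketch of what it could look like (the 60-digit cutoff is an arbitrary choice, and floating-point rounding limits how many digits are meaningful):

def float_to_binary(p, digits=60):
    # binary expansion of p in [0, 1) as a string such as "0.000110011..."
    bits = '0.'
    for _ in range(digits):
        p *= 2                          # shift the next binary digit into the integer part
        bits += '1' if p >= 1 else '0'
        p -= int(p)                     # keep only the fractional part and continue
    return bits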
Some proof:
import numpy as np
np.mean([simulate(0.4) for i in range(10000)])
is about 0.4 (not fast however)
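For completeness, here is a minimal sketch of the first method as well (assuming, as above, that we have already reduced to a fair coin; the name bernoulli_rational is just illustrative):

import random

def bernoulli_rational(a, b):
    # return 1 with probability a/b using only fair coin flips
    while True:
        flips = [random.random() < 0.5 for _ in range(b)]   # b fair flips
        if sum(flips) == a:            # keep only runs with exactly a heads
            return int(flips[0])       # P(first flip is heads | a heads in b flips) = a/b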
I see a simple solution, but no doubt there are many ways to do it, some presumably simpler than this. This approach can be broken down into two steps:
Generating from two events with equal probability given an unfair coin-tossing procedure (the combination of the particular coin and the method by which it is tossed generating a head with probability $p$). We can call these two equally probable events $H^*$ and $T^*$. [There's a simple approach for this that takes pairs of tosses, $H^*=(H,T)$ and $T^*=(T,H)$, to produce two equally likely outcomes, with all other outcomes leading to a new pair of tosses to try again.]
Now you generate a random walk with two absorbing states using the simulated fair coin. By choosing the distance of the absorbing states from the origin (one above and one below it), you can set the chance of absorption at, say, the upper absorbing state to be a desired ratio of integers. Specifically, if you place the upper absorbing barrier at $b-a$ and the lower one at $-a$ (and start the process from the origin), and run the random walk until absorption, the probability of absorption at the upper barrier is $\frac{a}{a+(b-a)} = \frac{a}{b}$.
(There are some calculations to be done here to show it, but you can get the probabilities out fairly easily by working with recurrence relations ... or by summing infinite series ... or there are other ways.)
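Here is a minimal sketch of the whole procedure (the name bernoulli_walk is just illustrative, and for brevity the simulated fair coin from step 1 is stood in by random.random() < 0.5):

import random

def bernoulli_walk(a, b):
    # return 1 with probability a/b via a symmetric random walk (gambler's ruin)
    position = 0
    while -a < position < b - a:                        # absorbing barriers at -a and b - a
        position += 1 if random.random() < 0.5 else -1  # one fair step of the walk
    return int(position == b - a)                       # absorbed at the upper barrier?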