
I have the following discrete distribution, where $\alpha,\beta$ are known constants:

$$ p(x;\alpha,\beta) = \frac{\text{Beta}(\alpha+1, \beta+x)}{\text{Beta}(\alpha,\beta)} \;\;\;\;\text{for } x = 0,1,2,\dots $$

What are some approaches to efficiently sample from this distribution?

jII

2 Answers


Given that $$\frac{\text{Beta}(\alpha+1, \beta+x)}{\text{Beta}(\alpha,\beta)}=\dfrac{\alpha}{\alpha+\beta+x}\,\dfrac{\beta+x-1}{\alpha+\beta+x-1}\cdots\dfrac{\beta}{\alpha+\beta}$$ is decreasing in $x$, I suggest generating a uniform variate $u\sim\mathcal{U}(0,1)$ and computing the cumulative sums $$S_k=\sum_{x=0}^k \frac{\text{Beta}(\alpha+1, \beta+x)}{\text{Beta}(\alpha,\beta)}$$ until $$S_k>u$$ The realisation is then equal to the corresponding $k$. Since $$\begin{aligned}R_x&=\frac{\text{Beta}(\alpha+1, \beta+x)}{\text{Beta}(\alpha,\beta)}\\&=\dfrac{\alpha}{\alpha+\beta+x}\,\dfrac{\beta+x-1}{\alpha+\beta+x-1}\cdots\dfrac{\beta}{\alpha+\beta}\\&=\frac{\alpha+\beta+x-1}{\alpha+\beta+x}\,\frac{\beta+x-1}{\alpha+\beta+x-1}\,R_{x-1}\\&=\frac{\beta+x-1}{\alpha+\beta+x}\,R_{x-1}\end{aligned}$$ and $$S_k=S_{k-1}+R_k,$$ the computation can avoid using Gamma functions altogether.

Xi'an
  • (+1) Using $S_k = 1-\frac{\Gamma (a+b) \Gamma (b+k+1)}{\Gamma (b) \Gamma (a+b+k+1)}$ will greatly expedite the work. – whuber Dec 30 '15 at 15:52
  • Re the edit: I suspect exploiting Gamma functions will nevertheless be helpful in solving for $k$ in terms of $u$, $\alpha$, and $\beta$. For instance, one could find an initial approximation to $u$ by using Stirling's Formula in the evaluation of $\Gamma(b+k+1)$ and $\Gamma(a+b+k+1)$ and then polishing that with a few Newton-Raphson steps. Those need evaluation of log Gamma and its derivative. Of course if $\alpha$ and $\beta$ are both integral then the solution is a root of a polynomial--but even then using Gamma may still be the way to go. – whuber Dec 30 '15 at 17:17
  • Great answer! I accepted the answer given by SL because it brought to my attention a key point (not part of the original question): sampling from the posterior predictive is equivalent to sampling the parameter from the posterior, then sampling the data from the likelihood. In particular, the distribution function above is the posterior predictive of Geometric data with a Beta prior on the parameter $p$. – jII Dec 30 '15 at 18:05
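whuber's closed form for $S_k$ in the comments above can be checked numerically against the running sum; a Python sketch (helper names `S_closed` and `S_sum` are assumptions, with `math.lgamma` keeping the evaluation stable for large arguments):

```python
import math

def S_closed(alpha, beta, k):
    """S_k = 1 - Gamma(a+b) Gamma(b+k+1) / (Gamma(b) Gamma(a+b+k+1)),
    evaluated on the log scale for numerical stability."""
    log_ratio = (math.lgamma(alpha + beta) + math.lgamma(beta + k + 1)
                 - math.lgamma(beta) - math.lgamma(alpha + beta + k + 1))
    return 1.0 - math.exp(log_ratio)

def S_sum(alpha, beta, k):
    """The same quantity by accumulating the pmf with the recurrence
    R_x = (beta + x - 1) / (alpha + beta + x) * R_{x-1}."""
    r = alpha / (alpha + beta)
    s = r
    for x in range(1, k + 1):
        r *= (beta + x - 1) / (alpha + beta + x)
        s += r
    return s
```

The closed form also makes whuber's suggestion practical: one can solve $S_k > u$ for $k$ directly instead of accumulating term by term.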

This is a Beta negative binomial distribution, with parameter $r=1$ in your case, using the Wikipedia notation. It is also called the Beta-Pascal distribution when $r$ is an integer. As you noted in a comment, this is the predictive distribution in the Bayesian negative binomial model with a conjugate Beta prior on the success probability.

Thus you can sample from it by drawing a $\text{Beta}(\alpha,\beta)$ variable $u$ and then drawing a negative binomial variable $\text{NB}(r,u)$ (with $r=1$ in your case, that is to say a geometric distribution).
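This two-stage recipe is easy to sketch in Python (the function name `rbnb` is an assumption; the geometric draw counts failures before the first success, matching the support $x=0,1,2,\dots$ of the question):

```python
import random

def rbnb(alpha, beta, n, rng=random):
    """Beta negative binomial with r = 1: draw u ~ Beta(alpha, beta),
    then count failures before the first success at probability u."""
    samples = []
    for _ in range(n):
        u = rng.betavariate(alpha, beta)
        x = 0
        while rng.random() > u:   # each failure occurs with probability 1 - u
            x += 1
        samples.append(x)
    return samples
```

For $\alpha>1$ the mean of this distribution (with $r=1$) is $\beta/(\alpha-1)$, which gives a quick sanity check on the output.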

This distribution is implemented in the R package brr. The sampler is named rbeta_nbinom, the pmf dbeta_nbinom, and so on. The notations are $a=r$, $c=\alpha$, $d=\beta$. Check:

> Alpha <- 2; Beta <- 3
> a <- 1
> all.equal(brr::dbeta_nbinom(0:10, a, Alpha, Beta), beta(Alpha+a, Beta+0:10)/beta(Alpha,Beta))
[1] TRUE

Looking at the code, one can see it actually calls the ghyper (generalized hypergeometric) family of distributions of the SuppDists package:

brr::rbeta_nbinom
function(n, a, c, d){
  rghyper(n, -d, -a, c-1)
}

Indeed, the BNB distribution is known as a type IV generalized hypergeometric distribution. See the help of ghyper in the SuppDists package. I believe this can also be found in Johnson et al.'s book Univariate Discrete Distributions.

Stéphane Laurent
  • This answer is great, but it would be even better if you proved that the density OP posted is the same as the negative binomial density. – Sycorax Dec 30 '15 at 20:06
  • @user777 I think the author of the OP proved it himself, in view of his comment to Xi'an's answer (posterior predictive distribution in the negative binomial model with a conjugate Beta prior). – Stéphane Laurent Dec 30 '15 at 21:31