
I have a problem that I was able to reduce to the following statement:

What is the probability of throwing at least k ones when throwing n fair six-sided dice and m fair eight-sided dice?

Background information can be found at this Math.SE question if you are interested.

After initially not finding anything when googling, I finally found out about the Poisson Binomial Distribution. However, I only recently graduated, it's been two years since I took my Probability Theory & Statistics course, and things are a bit difficult to understand.

I found this CrossValidated.SE question and the relevant Wikipedia page, which claim that it is computable, but Wikipedia also notes that the method is not numerically stable, which is a property I'd like to avoid.

Now I figured I have a special case here: I only have two different probabilities for my Bernoulli trials. What is the simplest formula for calculating my problem statement? I'm also interested in the practical implementation, as I want to implement this in a programming language such as C#.

skiwi

2 Answers


[While this is indeed a special case of the Poisson-Binomial, I wouldn't approach it that way; it's more general than you want in this case.]

Let $X_6$ be the number of $1$'s on $n$ six-sided dice. Let $X_8$ be the number of $1$'s on $m$ eight-sided dice.

Assume the results on each die are independent of the results on all other dice and that each die (in combination with the die-rolling process) is fair.

Then $X_6\sim \text{Bin}(n,\frac16)$ is independent of $X_8\sim \text{Bin}(m,\frac18)$. Let $Y=X_6+X_8$. Then

$$P(Y=k)=\sum P(X_6=i)P(X_8=k-i)\,,$$

where the limits of the sum are over the feasible values of $i$.

Consider the smallest that $i$ (the number of six-siders having the value $1$) can be -- if $m\geq k$ (i.e. if we can get the full load of $k$ $1$'s from the eight-sided dice), the smallest value of $i$ will be $0$. Otherwise some of the $k$ $1$'s will have to come on the six-siders -- in that situation $i$ must be no smaller than $k-m$.

Now the largest value of $i$ will be $k$, unless $n$ is too small for that, in which case at most $n$ of the $k$ ones can come from the six-siders.

So the lower and upper limits of the sum are $\max(0,k-m)$ and $\min(n,k)$ respectively (and if the upper limit is not at least as large as the lower limit the sum is simply 0).
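For example (numbers chosen just for illustration), with $n=2$ six-siders, $m=3$ eight-siders and $k=4$, the sum runs from $i=\max(0,4-3)=1$ up to $i=\min(2,4)=2$: at least one of the four ones must come from the six-siders, and at most two of them can.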

Consequently:

$$P(Y=k)\:=\sum_{i=\max(0,k-m)}^{\min(n,k)} P(X_6=i)P(X_8=k-i)\,.\qquad\qquad\quad (1)$$

Now $P(X_6=i)={n\choose i}\left(\frac16\right)^i\left(\frac56\right)^{n-i}$ and $P(X_8=k-i)={m\choose {k-i}}\left(\frac18\right)^{k-i}\left(\frac78\right)^{m-k+i}$ (since $X_6$ and $X_8$ are simply binomial), so we can write

$$P(Y=k)\:=\sum_{i=\max(0,k-m)}^{\min(n,k)} {n\choose i}\left(\frac16\right)^i\left(\frac56\right)^{n-i}\cdot {m\choose {k-i}}\left(\frac18\right)^{k-i}\left(\frac78\right)^{m-k+i}\,.\quad (2)$$

While some simplification is possible there (the most obvious being that you can pull the constant powers out the front and be left with $\left(\frac{7}{5}\right)^i$ times the two choose terms inside the sum), it's probably not worth doing in general (though some special cases may have some value).
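For reference, here is a minimal R sketch that evaluates $(2)$ directly, term by term (the helper name p_exact is purely illustrative, not anything standard):

# Sketch of formula (2): P(Y = k) with n six-siders and m eight-siders
p_exact <- function(k, n, m) {
  lo <- max(0, k - m); hi <- min(n, k)
  if (lo > hi) return(0)            # no feasible i: probability is 0
  i <- lo:hi
  sum(choose(n, i) * (1/6)^i * (5/6)^(n - i) *
      choose(m, k - i) * (1/8)^(k - i) * (7/8)^(m - k + i))
}
p_exact(3, n = 5, m = 7)            # e.g. P(exactly 3 ones) with 5 d6 and 7 d8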

If I wanted the probability function for $Y$, I'd be inclined to do the convolution of the two binomials numerically for the specific $m$ and $n$ I needed. This is quite straightforward and can be readily accomplished in something like R or Excel - or indeed C# - for example.

Which is to say: you probably have access to a library that computes binomial probabilities (like $P(X_6=i)$) both rapidly and accurately. Once you have the vectors of probabilities for $X_6$ and $X_8$, you go back to $(1)$ -- you simply "flip" one of the two vectors of $p_i$'s around, compute a shifting sum-of-products (starting from the smallest overlap at one end and moving to the smallest overlap at the other end), and make sure you've got the right value-labels for each such sum-of-products; then you have the entire probability function. [You probably have some fast inner-product function already available, so even if you don't have an inbuilt convolution function in some handy library, once you have the binomial probabilities it's just a loop over the values of $k$, with one inner product of pre-computed probabilities per value.]
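One way that recipe might look in R (dbinom supplies the binomial probabilities; the object names p6, p8, pmf_Y and p_at_least are just my own choices), with the last two lines answering the original "at least $k$ ones" question by summing the tail of the probability function:

n <- 5; m <- 7                       # example sizes
p6 <- dbinom(0:n, n, 1/6)            # P(X6 = 0), ..., P(X6 = n)
p8 <- dbinom(0:m, m, 1/8)            # P(X8 = 0), ..., P(X8 = m)
pmf_Y <- sapply(0:(n + m), function(k) {
  i <- max(0, k - m):min(n, k)       # feasible values of i, as in (1)
  sum(p6[i + 1] * p8[k - i + 1])     # the shifting sum-of-products
})
p_at_least <- function(k) sum(pmf_Y[(k + 1):(n + m + 1)])
p_at_least(3)                        # P(at least 3 ones)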

If you need this for large $m$ & $n$ there are more efficient approaches (e.g. using fast Fourier transforms) but I expect that you won't have large enough numbers to make it worth worrying about.

Glen_b
  • $P$ can be expressed in terms of the hypergeometric function. This can be useful when $k$ is largish and high precision is desired, or for additional analysis. – whuber Dec 26 '16 at 14:07

I think this should be fairly easy to do in R. The major work is performed by the expand.grid function:

n <- 5; m <- 7
# One column per die, holding its two probabilities: shows a 1 (1/6 or 1/8) or not (5/6 or 7/8)
P <- expand.grid(append(lapply(1:n, function(x) c(1/6, 5/6)), lapply(1:m, function(x) c(1/8, 7/8))))
# For each k, keep the rows (outcomes) with at least k "one" entries and sum their row products
p_at_least <- sapply(1:(n+m), function(k) sum(apply(P[apply(P <= 1/6, 1, sum) >= k, ], 1, prod)))
print(p_at_least)
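Note that expand.grid builds all $2^{n+m}$ combinations of per-die outcomes, so this brute-force approach is only practical for fairly small $n$ and $m$; for larger problems the binomial convolution in the other answer scales much better.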