
I am interested in how uncertainty can be accounted for when considering the risk of extinction of a species. Forgive me for extending a rather tired thought experiment, but at least it's familiar territory and I hope it illustrates what I am trying to understand.

Let's say that Schrödinger was not satisfied with killing and not killing only one cat, so he went out and collected the last 15 remaining Himalayan Snow-Cats. He put each one in a box with a vial of poison, a hammer, and a trigger device for releasing the hammer. Each trigger device has a known probability of releasing the hammer within any given hour, and the poison will take 5 hours to kill a cat (things got a bit trippy when he used radioactive decay, so this time he steers clear of quantum mechanics). After one hour, Schrödinger receives a notice from the ethics board telling him he's nuts and ordering him to release the cats immediately. He pushes a button which opens a cat-flap at the back of each box, thus releasing them all back into the wild. By the time the humane society checks the boxes, all the cats have bolted. Before anyone can restrain him, Schrödinger detonates the remaining triggers, so nobody knows how many of the cats were poisoned.

The probability that each of the cats is poisoned is as follows:

  1. 0.17
  2. 0.46
  3. 0.62
  4. 0.08
  5. 0.40
  6. 0.76
  7. 0.03
  8. 0.47
  9. 0.53
  10. 0.32
  11. 0.21
  12. 0.85
  13. 0.31
  14. 0.38
  15. 0.69

After 5 hours have elapsed and all poisoned cats have died, what is the probability that there are:

  • (A) More than 10 Himalayan Snow-Cats still alive
  • (B) 10 or fewer still alive
  • (C) 5 or fewer still alive
  • (D) 2 or fewer still alive
rudivonstaden
    This seems like homework. What have you tried so far? – Dason Oct 26 '12 at 15:40
  • No, not homework, just how I generalised a particular real-world problem. It would be more complicated to explain the real scenario. As I see it, step 1 is to work out the probability of there being 15,14,...,1,0 cats still alive. In step 2, P(Least Concern) = P(15 alive) + P(14 alive) + ... + P(10 alive). Similarly for the other categories. Extremes of step 1 are easy: P(all dead) = P(C1 dead).P(C2 dead)...P(C15 dead) = 4.09x10^-8; P(all alive) = (1-P(C1 dead)).(1-P(C2 dead))...(1-P(C15 dead)) = 5.826x10^-5. It's the combinatorics that I'm trying to figure out at the moment. – rudivonstaden Oct 26 '12 at 16:00
  • For completeness, the real problem is that many plant species have a limited number of historical collections, and there is some uncertainty over the exact site of these collections. So we may have a point with a confidence radius of, say 5km. By looking at land transformation within that radius, we can say that a certain percentage of that land is no longer suitable habitat, but we can't say for sure whether the plant still occurs there or not. The transformation percentage can then be used to estimate the probability that the historical population is now extinct. – rudivonstaden Oct 26 '12 at 16:20
  • Given that all probabilities are different, I would approach this by simulation. Produce say 10,000 universes with Schrodinger and 15 cats in each, have the random cats die in each of these, and obtain the sampling distribution of your overdispersed binomial(15). (A rough sketch of this appears after the comments below.) – StasK Oct 26 '12 at 16:40
  • Simulation is definitely an option, but since I would like to build this into an automated process (and preferably fast), I was hoping I could reduce this to a formula somehow. Given that the probabilities are different, that may not be possible? – rudivonstaden Oct 26 '12 at 17:20
  • @StasK Why simulate, when there are only $2^{15}$ subsets to work with? In fact, by using the FFT to expand $\prod_{z \in \{0.17, \ldots, 0.69\}} (1-z+z t)$ in powers of $t$, the computation is practically instantaneous. – whuber Oct 26 '12 at 17:34
  • @rudivonstaden +1 for an excellent example of how to transfer a problem from one content domain to another for the purpose of understanding. I will be using this with my Bio statistics students this spring as we discuss the problem of making appropriate assumptions and simplifications. This is always one of the most difficult parts of building the mathematical model. – R. Schumacher Oct 26 '12 at 20:10
  • http://stats.stackexchange.com/questions/9510/probability-distribution-for-different-probabilities is essentially a duplicate of this question. – whuber Dec 02 '13 at 17:08
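
As a rough illustration of the simulation approach suggested in the comments (my own sketch; the variable names and the choice of 10,000 replicates are just for illustration):

p <- c(0.17, 0.46, 0.62, 0.08, 0.40, 0.76, 0.03, 0.47, 0.53, 0.32, 0.21, 0.85, 0.31, 0.38, 0.69)
set.seed(1)                                  # only to make the sketch reproducible
deaths <- replicate(10000, sum(rbinom(length(p), size = 1, prob = p)))   # poisoned cats in each simulated universe
table(factor(15 - deaths, levels = 0:15)) / 10000    # empirical distribution of the number of survivors

The empirical frequencies should approximate the exact probabilities computed in the answers below.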

2 Answers


It's unclear what "risk of extinction category" means, but it appears the question asks for the distribution of the sum of 15 independent Bernoulli (binomial with one trial) variates having the given expectations. This is a convolution, and it is most efficiently done with the Fast Fourier Transform.

Here is an example in R, whose convolve function uses the FFT:

# x holds the poisoning probabilities; z accumulates the convolution,
# so after the loop z[i] is the probability of exactly i-1 poisonings.
x <- c(0.17,0.46,0.62,0.08,0.40,0.76,0.03,0.47,0.53,0.32,0.21,0.85,0.31,0.38,0.69)
z <- 1
for (u in sort(x)) z <- convolve(z, c(u, 1-u), type="open")
z

[1] 5.826e-05 1.069e-03 8.233e-03 3.566e-02 9.775e-02 1.805e-01 2.324e-01 2.128e-01

[9] 1.395e-01 6.520e-02 2.142e-02 4.800e-03 6.979e-04 6.039e-05 2.647e-06 4.091e-08

As a check, in Mathematica the same results were obtained with

Product[1 - z + z t, {z, x}] // Expand

$4.091095\times 10^{-8} t^{15}+2.647052\times 10^{-6} t^{14}+\cdots+0.0010688 t+0.0000582614$

Now, for instance, the chance of $10$ or fewer poisonings is computed in R as

sum(z[1:11])

[1] 0.9944
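
In the same spirit, the four categories in the question can be read off as partial sums of `z` (a sketch, assuming as above that `z[i]` is the probability of exactly $i-1$ poisonings, so $k$ survivors corresponds to $15-k$ poisonings; the names `p.A` through `p.D` are just illustrative):

p.A <- sum(z[1:5])    # (A) more than 10 alive  <=>  at most 4 poisoned
p.B <- sum(z[6:16])   # (B) 10 or fewer alive   <=>  at least 5 poisoned
p.C <- sum(z[11:16])  # (C) 5 or fewer alive    <=>  at least 10 poisoned
p.D <- sum(z[14:16])  # (D) 2 or fewer alive    <=>  at least 13 poisoned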

Edit

Using R's convolve function is inefficient for larger problems, because it repeatedly performs an FFT and its inverse. It turns out that the direct algorithm for convolution--as described by StasK--is plenty fast enough. Here is an R implementation.

convolve.binomial <- function(p) {
  # p is a vector of probabilities of Bernoulli distributions.
  # The convolution of these distributions is returned as a vector
  # `z` where z[i] is the probability of i-1 successes, i=1, 2, ..., length(p)+1.
  n <- length(p) + 1
  z <- c(1, rep(0, n-1))
  for (q in p) z <- (1-q)*z + q*c(0, z[-n])
  return(z)
}

(Thanks to an anonymous editor suggesting improvements that clarify the algorithm.)

This takes $O(n^2)$ time for $n$ distributions--the quadratic behavior is not good--but it's still quite fast. As an example, let's generate 10,000 random probabilities (instead of using 15 given ones) and form the convolution of the corresponding Bernoulli distributions:

x <- runif(10000)
system.time(y <- convolve.binomial(x))

This still takes less than 3 seconds.
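
As a quick consistency check (a sketch; `p.cats` and `z2` are names introduced here, and `z` is the FFT-based result computed earlier for the 15 cats):

p.cats <- c(0.17,0.46,0.62,0.08,0.40,0.76,0.03,0.47,0.53,0.32,0.21,0.85,0.31,0.38,0.69)
z2 <- convolve.binomial(p.cats)
max(abs(z2 - z))   # should be at the level of floating-point rounding error
sum(z2)            # total probability should be essentially 1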

whuber
  • Thanks for your elegant solution, @whuber! You were correct in your understanding of "risk of extinction category"; since it could potentially be confusing and doesn't add anything to the question, I have removed those references. – rudivonstaden Oct 26 '12 at 18:13
  • I am accepting this answer because I will be doing this analysis from within the QGIS mapping application, which integrates easily with R. – rudivonstaden Oct 26 '12 at 19:33
  • You shouldn't accept an answer on that basis! @StasK's algorithm is easily implemented in `R`. I added code to show how. – whuber Oct 27 '12 at 20:22
  • I would accept both answers if I could as both answer the question comprehensively! Your answer was more immediately useful, while @StasK's solution was more instructive. On that basis I will select his answer as the most relevant to the question. But I appreciate the time and insight that you have both contributed. – rudivonstaden Oct 27 '12 at 22:28
  • This lends itself well to exploring consequences of different aspects of the set of probabilities; experimentation shows that the order of the probabilities has no effect. I'm not sure if that is immediately obvious, but now I suspect a proof by induction should be possible. – DWin Dec 02 '13 at 19:21
  • @DWin The order of the probabilities has a small effect on finite-precision floating-point calculations (for well-known reasons), but it has no effect on the exact results. This is made obvious by considering that the characteristic function of the resulting distribution is the product of the cfs of the Bernoulli distributions: because the order of multiplication in the product does not matter and the cf determines the distribution, the result is the same regardless of the order. The floating point errors are typically about as small as can be hoped, less than $10^{-16}$. – whuber Dec 02 '13 at 19:39
  • Thanks from an old guy who never took college level probability courses. I've tried learning this stuff from textbooks, but having the capacity to play around with it in R makes it more immediate. I will try to link up this experience with my copies of Feller's wonderful volumes. – DWin Dec 02 '13 at 19:43
  • I am trying to understand [your code](http://stackoverflow.com/q/39809756/4089351) to see exactly the mechanics of the calculation of the convolution. – Antoni Parellada Oct 01 '16 at 18:00
  • @Antoni, each step adds a Bernoulli$(q)$ variable to the discrete variable whose probability generating function (pgf) is given in the variable `z`. If you like, think of `z` as representing the polynomial $z(t)=z_1+z_2t+z_3t^2+\cdots+z_nt^{n-1}$. The pgf of the Bernoulli variable is $f_q(t)=(1-q)+qt$. Each step of `sapply` computes an update $z(t)\to z(t)f_q(t)$. The initial value is $z(t)=1$. Thus, given the vector $\mathbf{p}=(p[1],p[2],\ldots,p[n-1])$, the loop computes the product $$\prod_{i=1}^{n-1}f_{p[i]}(t),$$ which is a polynomial of degree $n-1$ (therefore with $n$ coefficients). – whuber Oct 01 '16 at 20:40
  • So $z_1$ is $\Pr(\text{zero poisonings})$, and the $t$ in the $qt$ part of the pgf of each new Bernoulli $p_i$ shifts the corresponding terms of the polynomial by one position... So clear now... – Antoni Parellada Oct 02 '16 at 05:01
  • FWIW, my preceding comment may be confusing in its reference to `sapply`: that construct has since been replaced by a one-line `for` loop. The algorithm, and its explanation, remain the same. – whuber Feb 22 '18 at 15:36

Let $p_k, k=1, \ldots, K=15$ be the survival probabilities for the individual snow-cats (i.e., one minus the poisoning probabilities given in the question).

  1. Initialize the number of cats accounted for so far, $k \leftarrow 0$, and the vector of probabilities for the number of survivors, $(\pi_0^{(0)}, \pi_1^{(0)}, \ldots, \pi_K^{(0)}) \leftarrow (1, 0, \ldots, 0, 0)$
  2. While there are still unaccounted-for cats, $k < K$, repeat steps 3-5:
  3. Increase $k \leftarrow k+1$
  4. Update the 0 outcome (death): $\pi_j^{(k)} \leftarrow \pi_j^{(k-1)} (1-p_k), \; j=0, \ldots, K$
  5. Update the 1 outcome (survival): $\pi_{j+1}^{(k)} \leftarrow \pi_{j+1}^{(k)} + \pi_j^{(k-1)} p_k, \; j=0, \ldots, K-1$

I would probably be paranoid about accounting for everything, and make sure that my probabilities still sum up to 1 at each iteration after step 5.
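
A minimal R sketch of these steps (my own illustration, not code from this answer; the function name and the $10^{-12}$ tolerance are arbitrary choices), including that paranoid sum-to-one check after each update:

count.survivors <- function(p) {
  # p: survival probabilities p_1, ..., p_K (one minus the poisoning probabilities in the question)
  K <- length(p)
  pr <- c(1, rep(0, K))                  # pr[j+1] = P(j survivors among the cats processed so far)
  for (k in seq_len(K)) {
    dead  <- pr * (1 - p[k])             # step 4: cat k dies
    alive <- c(0, pr[-(K + 1)]) * p[k]   # step 5: cat k survives, shifting the survivor count up by one
    pr <- dead + alive
    stopifnot(abs(sum(pr) - 1) < 1e-12)  # probabilities should still sum to 1
  }
  pr
}

pr <- count.survivors(1 - c(0.17, 0.46, 0.62, 0.08, 0.40, 0.76, 0.03, 0.47,
                            0.53, 0.32, 0.21, 0.85, 0.31, 0.38, 0.69))
round(pr, 8)   # should reproduce the probabilities tabulated below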

My results, as probabilities for the number of survivors, are:

survivors  probability
 0         4.091e-08
 1         2.647e-06
 2         .00006039
 3         .00069791
 4         .00479963
 5         .02141555
 6         .06519699
 7         .13945642
 8         .21276277
 9         .23238555
10         .18045155
11         .09775029
12         .03565983
13         .00823336
14         .0010688
15         .00005826

The sum of the first three values is the probability of being critically endangered; the sum of the last five is the probability of least concern; and so on.
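
For example, continuing the sketch above (where `pr` holds the 16 survivor probabilities in the same order as the table):

sum(pr[1:3])    # 2 or fewer survivors ("critically endangered")
sum(pr[12:16])  # more than 10 survivors ("least concern")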

StasK
  • Thank you, @StasK, this also looks like a good solution. +1 for the algorithm. – rudivonstaden Oct 26 '12 at 18:22
  • (+1) This is identical to the convolution solution I posted, except the convolutions are performed directly rather than with a built-in convolution function. There's little reason to prefer one method over the other for 15 probabilities, but with, say, 150000 of them, you might want to look at using an FFT. I had the same suspicions as you about numerical error creeping in, which is why I did the *Mathematica* comparison: it used exact arithmetic and its answers--fractions with large numerators and denominators--were converted to machine precision only at the end. The agreement was great. – whuber Oct 26 '12 at 18:59