5

I'm trying to find probabilities for a scenario that works exactly like hypergeometric distribution, except that every time you make a draw, you replace it with a "failure"-object. So eventually you'd end up with no successes among the population. What I'd like to describe is the probability of a certain number of successes given a certain number of draws.

An example of this situation could be: You're pulling marbles out of an urn. There are green marbles (successes) and white marbles. Everytime you pull out a white marble (failure) you just put it back but every time you pull out a green marble (success) you replace it with a white marble. So the number of marbles remain constant, but the number of successes decreases each time you draw a success.

Do we have a probability distribution to describe this?

Tim
  • 108,699
  • 20
  • 212
  • 390
Kristoffer
  • 53
  • 5
  • 1
    What do you mean by "false" objects? Maybe you could give us an example? Maybe you mean negative hypergeometric distribution https://en.wikipedia.org/wiki/Negative_hypergeometric_distribution – Tim Sep 24 '17 at 16:17
  • Hi Tim, thanks for your comment. I added an example to hopefully make it more clear what I'm asking. I hope it makes more sense now. It is unfortunately not negative hypergeometric distribution which I'm looking for. – Kristoffer Sep 24 '17 at 17:43
  • What exactly is the random variable you want to find the distribution? You described a proccess, but not the variable you're interested in. Is it the total number of green marbles after a fixed number of trials? – Freguglia Sep 25 '17 at 03:00
  • Hi Victor! Yes, that is what I am interested in. I have found an expression for the expected number, but would like to add see some probabilities. – Kristoffer Sep 25 '17 at 06:42

2 Answers2

6

Suppose an urn of $n$ balls begins with $s$ successes. What are the chances it will end up with $t$ successes ($0 \le t \le s$) after $d$ draws?

Ignoring the trivial case $s=0$, this is a Markov chain on the numbers of successes $s$. The chance of a transition from $s$ to $s-1$ is $p(s,s-1)=s/n$; otherwise, the state stays the same: $p(s,s)=1-s/n$.

The $n+1\times n+1$ transition matrix $\mathbb{P} = p(s,t)$ (indexing from $0$ through $n$) readily decomposes as

$$\mathbb P = \mathbb{V}\operatorname{Diagonal}\left(\frac{n}{n}, \frac{n-1}{n}, \ldots, \frac{1}{n}, \frac{0}{n}\right) \mathbb{V}^{-1}$$

where $$\mathbb{V} = (v_{ij}, i=0,\ldots,n; j=0,\ldots, n) = \left(\binom{i}{j}\right)$$

is Pascal's Triangle and $$\mathbb{V}^{-1} = (v^{-1}_{ij}) = \left((-1)^{i+j}\binom{i}{j}\right)$$

contains the same values but with alternating signs.

The chance of a transition from $s$ to $t$ after $d$ steps, $p_n(s,t)$, is found from the $s,t$ entry in $\mathbb{P}^n$ (again indexing from $0$ through $n$). But we easily compute

$$\mathbb{P}^n= \mathbb{V}\operatorname{Diagonal}\left(\frac{n^d}{n^d}, \frac{(n-1)^d}{n^d}, \ldots, \frac{1^d}{n^d}, \frac{0^d}{n^d}\right)\mathbb{V}^{-1}.$$

Consequently, omitting all terms obviously zero,

$$p_d(s,t) = n^{-d}\sum_{j=t}^s \binom{s}{j} (n-j)^d (-1)^{j+t}\binom{j}{t}.$$

![Figure plotting chance against t

These figures plot the chance of reaching an urn with $t$ successes starting with $s=100$ successes in $n=100$ total. The numbers of draws are 10,50,150,and 250. As they increase, the distribution moves to the left more and more slowly, first spreading and then contracting to zero.

In terms of $t$, these are generalized hypergeometric distributions.

whuber
  • 281,159
  • 54
  • 637
  • 1,101
4

There will be a recursion. If $S_{n,g,b}$ is the number of green balls successfully drawn with $n$ attempts starting with $g$ green balls and $b−g$ white balls then

$$\mathbb P (S_{n,g,b} = s)= \frac{g}{b} \mathbb P (S_{n-1,g-1,b} = s-1) + \frac{b-g}{b} \mathbb P (S_{n-1,g,b} = s)$$

starting with $\mathbb P (S_{0,g,b} = 0)= 1$, and with $\mathbb P (S_{n,g,b} = s)= 0$ when $s \gt n$ or $s \gt g$ or $s \lt 0$

For example, with $b=15$ balls in total and $n=5$ attempts, the probability of $s$ successes when starting with $g$ green balls is about:

Probability table with n=5 attempts, starting with g green and b=15 total balls 
(rows sum to 1)

        s (number of successes)                     
        0           1           2           3           4           5           6
g start                             
0       1           0           0           0           0           0           0
1       0.708246    0.291754    0           0           0           0           0
2       0.488946    0.438600    0.072454    0           0           0           0
3       0.327680    0.483797    0.174104    0.014420    0           0           0
4       0.212084    0.462386    0.274015    0.049462    0.002054    0           0
5       0.131687    0.401982    0.352000    0.104691    0.009481    0.000158    0
6       0.077760    0.323563    0.397037    0.174617    0.026074    0.000948    0
7       0.043151    0.242261    0.405689    0.250272    0.055309    0.003319    0
8       0.022133    0.168149    0.380523    0.320790    0.099556    0.008849    0
9       0.010240    0.107034    0.328533    0.374993    0.159289    0.019911    0
10      0.004115    0.061248    0.259556    0.402963    0.232296    0.039822    0
11      0.001348    0.030434    0.184691    0.397630    0.312889    0.073007    0
12      0.000320    0.012342    0.114726    0.356346    0.391111    0.125156    0
13      0.000042    0.003612    0.058548    0.282469    0.451951    0.203378    0
14      0.000001    0.000572    0.021570    0.186943    0.474548    0.316365    0
15      0           0.000020    0.004148    0.089877    0.431407    0.474548    0

I am not aware of a common named distribution like this

Henry
  • 30,848
  • 1
  • 63
  • 107
  • You can solve this recursion explicitly. One way is to represent the situation as a Markov chain. It's a particularly simple one. – whuber Sep 25 '17 at 15:03
  • @whuber: No doubt you are correct, but it did not seem particularly obvious to me in general. Clearly $\mathbb P (S_{n,g,b} = 0) = \frac{(b-g)^n}{b^n}$ and $\mathbb P (S_{n,g,b} = n) = \frac{g!}{(g-n)! \,b^n}$ but it looked more complicated for other values of $s$ – Henry Sep 25 '17 at 15:27