
This may be a bit more of an advanced algebra question, but here goes:

I'm trying to use the geometric distribution's probability mass function, P(X = k) = p(1-p)^k, to find the per-trial probability p when P(X = k) is known for some given k.

As an example stated in practical terms, let's say I know that every day 10% of the people in a hospital die. I wish to simulate the operation of the hospital using time steps of 1 minute. What per-minute probability of death for each person results in a rate of 10% per day?

Brandon Kohn
  • Your example really calls for using an Exponential distribution for the solution. So--are you interested in the solution to your example or in the abstract question you have posed about the Geometric distribution? – whuber May 20 '16 at 19:18
  • I'm interested in the answer to the discrete distribution. – Brandon Kohn May 20 '16 at 19:32
  • The way I look at it is that the simulation discretizes the day into a number of 'trials' for each person which is a function of the resolution. – Brandon Kohn May 20 '16 at 19:38
  • Perhaps--but you will work extremely hard to solve an equation of degree $1441$ compared to computing a logarithm that gives you the same answer to a very great number of significant figures! Another issue is this: do you really need to use a resolution of one minute, or is your intent to track the deaths precisely? If it's the latter, then the death events should drive your simulation rather than the clock: it's simpler and more efficient. I posted a working example at http://stats.stackexchange.com/a/129786/919: the code for `initialize` shows how the events ("customers") are generated. – whuber May 20 '16 at 20:36
  • Right, I see what you mean. I'm not very well versed in statistics... so .. learning. The formulation of the problem is just a story problem which has characteristics of the solution I require (which are set). Thanks for the link, I'll have a look to see if it helps me. – Brandon Kohn May 20 '16 at 20:38

1 Answer


Your question is somewhat problematic, because for a given probability $$\Pr[X = k] = p(1-p)^k$$ and a given $k \in \mathbb Z^+$, there are in general two distinct solutions $p \in (0,1)$ that satisfy the condition (and none at all if the given probability exceeds the maximum value identified below). Only when $p = (1+k)^{-1}$, corresponding to the probability $\Pr[X = k] = k^k (1+k)^{-(1+k)}$, will you have a unique solution. To see why, observe that on $p \in (0,1)$, the derivative of the probability mass function with respect to $p$ is $$\frac{d}{dp}\left[p(1-p)^k\right] = -(kp+p-1)(1-p)^{k-1}.$$ The factor $(1-p)^{k-1}$ is positive for all $k \in \mathbb Z^+$. The other factor is zero exactly when $p = (k+1)^{-1}$, as stated previously, and this corresponds to a global maximum on $(0,1)$: because $kp+p-1$ is a linear, increasing function of $p$, the derivative is positive for $0 < p < 1/(1+k)$ and negative for $1/(1+k) < p < 1$, so $\Pr[X = k]$ is increasing on the first interval and decreasing on the second. Hence, whenever the given probability is below the maximum, there is exactly one solution in each interval.
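Because the probability is monotone on each side of $p = 1/(1+k)$, each solution can be bracketed and found by bisection. The following is only a minimal sketch of that idea using the Python standard library; the function names `geometric_pmf`, `_bisect`, and `solve_geometric_p` are made up for illustration, and bisection is just one reasonable choice of root finder.

```python
def geometric_pmf(p, k):
    """Pr[X = k] = p * (1 - p)**k for a single-trial probability p."""
    return p * (1.0 - p) ** k


def _bisect(f, lo, hi, tol=1e-15, max_iter=200):
    """Root of f on [lo, hi], assuming f(lo) and f(hi) have opposite signs."""
    f_lo = f(lo)
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if f_mid == 0.0 or hi - lo < tol:
            return mid
        if (f_lo < 0) == (f_mid < 0):
            # Same sign as the left end: the root lies in the right half.
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def solve_geometric_p(k, target):
    """Return (small_p, large_p) with p * (1 - p)**k == target.

    Hypothetical helper: k must be a positive integer, and target must not
    exceed the maximum k**k / (k + 1)**(k + 1), attained at p = 1 / (k + 1).
    """
    if k < 1:
        raise ValueError("k must be a positive integer")
    p_star = 1.0 / (k + 1)
    peak = geometric_pmf(p_star, k)
    if not 0.0 < target <= peak:
        raise ValueError("no solution with p in (0, 1)")
    if target == peak:
        return p_star, p_star  # the unique solution

    def g(p):
        return geometric_pmf(p, k) - target

    # g < 0 at both endpoints and g > 0 at p_star, so each half brackets a root.
    return _bisect(g, 0.0, p_star), _bisect(g, p_star, 1.0)
```

Calling `solve_geometric_p(k, target)` returns the root below $1/(1+k)$ first and the root above it second; which of the two is the intended one has to be decided from context.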

For example, suppose we are given $X \sim \operatorname{Geometric}(p)$ and $\Pr[X = 5] = 1/20$. Then we can numerically verify that $$p \approx \{0.0730689462638205, 0.302187628494614\}$$ both work. A plot of the resulting PMFs is shown for $k \in \{1, \ldots, 15\}$:

[Figure: the PMFs of the two solutions]
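The values behind such a plot can be tabulated directly. This is a small sketch, not the code used to produce the figure; the names `P_SMALL`, `P_LARGE`, and `pmf_table` are illustrative, and the two values of $p$ are the ones quoted above.

```python
def geometric_pmf(p, k):
    """Pr[X = k] = p * (1 - p)**k."""
    return p * (1.0 - p) ** k


# The two numerical solutions quoted in the answer for Pr[X = 5] = 1/20.
P_SMALL = 0.0730689462638205
P_LARGE = 0.302187628494614


def pmf_table(ps, ks):
    """One row per k: (k, pmf under ps[0], pmf under ps[1], ...)."""
    return [(k, *[geometric_pmf(p, k) for p in ps]) for k in ks]


# Print the two PMFs side by side for k = 1, ..., 15.
for k, small, large in pmf_table((P_SMALL, P_LARGE), range(1, 16)):
    print(f"{k:2d}  {small:.6f}  {large:.6f}")
```

The two columns agree only at $k = 5$, where both equal $1/20$: the larger $p$ puts more mass on smaller $k$, and the smaller $p$ puts more mass on larger $k$.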

heropup