I would like to write a Python (or R, etc.) function that computes the likelihood of two events happening in the same millisecond. The events are independent and can happen at any time of the day. My goal is to be able to say "when there are 1,000,000 events a day, the likelihood of both happening in the same millisecond is X%".

I've read a bit about the Poisson distribution and the Python function `random.expovariate`, which could be used to predict the likelihood of events based on the Poisson distribution. But to my uneducated mind it is not clear whether the Poisson distribution is the right one.

Could it maybe even be as simple as the following?

likelihood=(1/86400000)*(1/86400000)*number_of_events
Arian
  • By "likelihood" do you mean "probability"? They are *different* things (see http://stats.stackexchange.com/questions/2641/what-is-the-difference-between-likelihood-and-probability ), so it's unclear what you mean. Is this about the probability of two events or of 1,000,000 events? – Tim May 23 '16 at 08:33
  • A technical hint: when working with numbers as small as `(1/86400000)*(1/86400000)`, you have to be careful about [the machine epsilon](https://en.wikipedia.org/wiki/Machine_epsilon) (for R see http://stackoverflow.com/questions/2619543/how-do-i-obtain-the-machine-epsilon-in-r). – Qaswed May 23 '16 at 09:17

1 Answer

The phrasing of your question is a bit ambiguous, so I'm going to interpret it like this: "The expected number of events per day is 1,000,000. Given that an event has just occurred, what's the probability that the next event occurs within 1ms?"

We can consider event times to be generated by a homogeneous Poisson process. This means that they're independent, and there's a parameter $\lambda$ that gives the expected number of events per unit time. $\lambda$ itself doesn't change over time in this model (if you want it to, you'd need an inhomogeneous Poisson process).

In your example, $\lambda$ = (events/day) × (days/ms) = 1e6/8.64e7 ≈ 0.0116 events per ms.

For a homogeneous Poisson process, the time between successive events (let's call it the inter-event interval) has an exponential distribution with mean $1 / \lambda$. In your example, this gives an average of 86.4ms between events.
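As a rough sanity check of that 86.4ms figure, here is a minimal simulation sketch using `random.expovariate`, the function mentioned in the question. The helper name `mean_inter_event_interval`, the sample size, and the seed are my own illustrative choices, not part of the original answer.

```python
import random

MS_PER_DAY = 24 * 60 * 60 * 1000  # 86,400,000 ms in a day


def mean_inter_event_interval(events_per_day, n_samples=200_000, seed=0):
    """Estimate the mean gap (in ms) between successive events by simulation.

    Hypothetical helper: draws exponential inter-event intervals with
    rate lambda = events_per_day / MS_PER_DAY (events per ms).
    """
    rate_per_ms = events_per_day / MS_PER_DAY
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    intervals = [rng.expovariate(rate_per_ms) for _ in range(n_samples)]
    return sum(intervals) / len(intervals)


if __name__ == "__main__":
    # Should land close to 1 / lambda = 86.4 ms, up to sampling noise.
    print(mean_inter_event_interval(1_000_000))
```

The simulated mean will not match 86.4 exactly; it only needs to be close enough to confirm that $1 / \lambda$ is being read in the right units.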

Given that an event has just occurred at time $t_0$, we want to calculate the probability that the next event occurs at time $t_1 \le t_0 + 1$ ms. That is, the inter-event interval will be between 0 and 1ms. To do that, we can integrate the probability density function (PDF) of the inter-event interval from 0 to 1ms. This is the same as evaluating its cumulative distribution function (CDF) at 1ms. The CDF of the exponential distribution is:

$$1 - e^{-\lambda t}$$

Evaluating this at $\lambda$ = 1e6/8.64e7 events per ms and $t$ = 1 ms gives a probability of approximately 0.0115.
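The calculation above can be sketched as a small Python function, under the same homogeneous Poisson assumption. The name `prob_next_event_within` and its parameters are illustrative, not from any library. Using `math.expm1` rather than `1 - math.exp(...)` is my own choice to keep precision when the rate is very small.

```python
import math

MS_PER_DAY = 24 * 60 * 60 * 1000  # 86,400,000 ms in a day


def prob_next_event_within(events_per_day, window_ms=1.0):
    """P(next event occurs within window_ms of the previous one).

    Hypothetical helper: evaluates the exponential CDF 1 - exp(-lambda * t)
    with lambda expressed in events per millisecond.
    """
    rate_per_ms = events_per_day / MS_PER_DAY
    # -expm1(-x) equals 1 - exp(-x) but stays accurate for tiny x.
    return -math.expm1(-rate_per_ms * window_ms)


if __name__ == "__main__":
    print(prob_next_event_within(1_000_000))  # roughly 0.0115
```

For example, `prob_next_event_within(1_000_000, window_ms=10)` would give the probability that the next event arrives within 10ms under the same model.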

user20160