
The statistics lectures I followed also covered the Poisson distribution. We were taught that the number of events occurring in a time interval follows a Poisson distribution.

$ P(k \text{ events in time interval}) = e^{-\lambda} \frac{\lambda^k}{k!} $
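
To make the formula concrete, here is a minimal sketch using only the Python standard library. It assumes arrivals whose gaps are exponentially distributed with rate $\lambda$ (the standard Poisson-process setup), counts the arrivals in a unit interval, and compares the simulated frequencies with the formula. The function names, `lam=3.0`, and the number of simulated intervals are illustrative choices, not part of the question.

```python
import math
import random

def poisson_pmf(k, lam):
    """P(k events) = exp(-lam) * lam**k / k!"""
    return math.exp(-lam) * lam**k / math.factorial(k)

def count_arrivals(lam, rng, horizon=1.0):
    """Count arrivals in [0, horizon] when gaps between arrivals are Exp(rate=lam)."""
    t = rng.expovariate(lam)
    n = 0
    while t <= horizon:
        n += 1
        t += rng.expovariate(lam)
    return n

def empirical_vs_pmf(lam=3.0, n_intervals=20000, seed=0):
    """Return (k, simulated frequency, formula value) for k = 0..7."""
    rng = random.Random(seed)
    counts = [count_arrivals(lam, rng) for _ in range(n_intervals)]
    rows = []
    for k in range(8):
        freq = counts.count(k) / n_intervals
        rows.append((k, freq, poisson_pmf(k, lam)))
    return rows

if __name__ == "__main__":
    for k, freq, pmf in empirical_vs_pmf():
        print(f"k={k}  simulated={freq:.4f}  formula={pmf:.4f}")
```

The two columns should agree up to sampling noise, which illustrates that the counts are Poisson as long as the gaps are independent and exponential with a constant rate.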

The context of this stochastic process is that of queuing - events that occur at random moments and have to wait before being processed.

Initial question: What is the deeper causal-stochastic reason why the Poisson distribution naturally occurs? An elaborate derivation of the stochastics behind Poisson processes can be found here.

But what happens when the assumptions of homogeneity and independence do not hold along the timeline? Visits to many commercial websites tend to be concentrated around noon and the afternoon. The number of visits per hour, say, does not follow one and the same Poisson distribution throughout the day. The one-parameter Poisson distribution is too simple for these real-world scenarios.
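
The website example can be sketched as follows, assuming an inhomogeneous Poisson process simulated by thinning (candidate points at a constant maximum rate, each kept with probability `rate(t) / RATE_MAX`). The rate curve, its peak at 13:00, and all numbers are hypothetical and only serve to illustrate the point.

```python
import math
import random
from statistics import mean, pvariance

def rate(t_hours):
    """Hypothetical visits-per-hour curve peaking around 13:00 (assumed shape)."""
    return 5.0 + 45.0 * math.exp(-((t_hours - 13.0) / 3.0) ** 2)

RATE_MAX = 50.0  # upper bound on rate(t), needed by the thinning step

def simulate_day(rng):
    """Return 24 hourly counts for one day of an inhomogeneous Poisson process."""
    counts = [0] * 24
    t = rng.expovariate(RATE_MAX)
    while t < 24.0:
        # Keep each candidate point with probability rate(t) / RATE_MAX.
        if rng.random() < rate(t) / RATE_MAX:
            counts[int(t)] += 1
        t += rng.expovariate(RATE_MAX)
    return counts

def dispersion(values):
    """Variance-to-mean ratio; close to 1 for a single Poisson distribution."""
    return pvariance(values) / mean(values)

def run(n_days=200, seed=1):
    rng = random.Random(seed)
    days = [simulate_day(rng) for _ in range(n_days)]
    hourly = [c for day in days for c in day]  # all hours of all days pooled
    noon = [day[13] for day in days]           # the 13:00-14:00 hour only
    return dispersion(hourly), dispersion(noon)

if __name__ == "__main__":
    pooled, fixed_hour = run()
    print(f"pooled hourly counts: variance/mean = {pooled:.2f}")
    print(f"13:00-14:00 only:     variance/mean = {fixed_hour:.2f}")
```

Under these assumptions the pooled hourly counts are strongly overdispersed (variance far above the mean), so no single $\lambda$ describes them, while the counts for one fixed hour stay close to variance = mean. In this sketch the one-parameter model fails only because the rate varies over the day.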

Just a note: the log-normal distribution, for example, has two parameters (the mean and standard deviation of the logarithm), whereas the Poisson distribution has only one, its mean $\lambda$.

I am aware that I am probing a foundation of statistics whose historic roots go back more than a century. Thanks for any comments and answers.

kjetil b halvorsen
Match Maker EE
  • In an M/M/1 queue, customers arrive so that the periods between arrivals are Exp(rate=λ). This implies that the total number of arrivals in a given interval is Poisson. Also, the server spends an exponentially distributed time with each customer (with _rate_ often denoted μ). The exponential dist'n has the 'memoryless' property (it doesn't matter how long it has been since the last arrival; starting at the present moment, the remaining time until the next arrival is still Exp(rate=λ)). This makes modeling easy, even if not exactly realistic. – BruceET Aug 07 '20 at 07:18
  • The M's in the notation stand for _memoryless_ (or Markov). Similarly, service times are not exactly exponential, but a reasonable imitation of actual behavior in many instances. So the Poisson dist'n for intervals comes from the assumption that interarrival and service times are exponential. The exponential assumption is partly based on reality and partly based on mathematical convenience and convenience for simulation. (In instances where exponential is not realistic, other distributions are used. Then the M's in the notation are replaced by G's (for _general_) as appropriate.) – BruceET Aug 07 '20 at 07:19
  • "The exponential assumption is partly based on reality and partly based on mathematical convenience and convenience for simulation". So the deeper question comes down to the truth of the exponential distribution for the interarrival times. Stochastic causality behind the Poisson process relates to the (would-be) exponentially distributed timestamps. The deeper truth of this mechanism is then what this question comes down to. – Match Maker EE Aug 07 '20 at 13:04
  • Perhaps the analysis I posted at https://stats.stackexchange.com/a/215253/919 (in response to a similar question) contains what you're looking for? – whuber Aug 07 '20 at 13:31
  • Thanks for your contribution @whuber. It comes down to the statement of independence between consecutive time intervals (Eq. (2) in the answer you link to), and the subsequent solution to the differential equation there. It will be too simple a model for a number of applications. I'm also sceptical about the simplistic 'one-parameter model' that results from these derivations, namely the lambda-parametrized Poisson distribution. – Match Maker EE Aug 07 '20 at 14:52
  • Your comment "it will be too simple a model" is so different from your question that it makes me wonder what question you are trying to ask. As far as simplicity goes, see https://stats.stackexchange.com/questions/129322 for a detailed example of how one builds complex models out of simple processes. Also see [our posts on inhomogeneous Poisson processes](https://stats.stackexchange.com/search?q=inhomogeneous+poisson+process) for the natural next step of making this model more flexible. – whuber Aug 07 '20 at 14:57
  • I saw a point in your comment, and made my question more specific. Thanks for your contributions via this discussion. – Match Maker EE Aug 07 '20 at 16:02
  • You now seem to be asking two opposite questions: your title asks why Poisson processes arise but the text now asks what happens when the conditions for a Poisson model don't hold. Please make your post consistent. – whuber Aug 07 '20 at 17:32
  • Title of this question is also modified now. – Match Maker EE Aug 09 '20 at 14:52
  • A paper looking at queueing and the lognormal distribution: https://hal.archives-ouvertes.fr/hal-01891760/document – kjetil b halvorsen Oct 26 '21 at 20:49

0 Answers