
I have data on $n$ people's dates of birth. Let's ignore the years and look only at the $k = 366$ days of the year (including Feb 29).

Assuming that dates of birth are uniformly and independently distributed over the year, this is similar to uniformly and independently distributing $n$ balls into $k$ bins. So I believe for any particular day, the number of people could be approximated by a Poisson random variable, with $\lambda = \frac{n}{k}$ as the mean.
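As a rough check of the single-day claim, here is a minimal sketch comparing the exact count for one day, which is Binomial($n$, $1/k$) under the uniform and independent assumption, with the Poisson($n/k$) approximation. The value $n = 1000$ is an illustrative assumption, not a figure from my data.

```python
import math

def binomial_pmf(n, p, x):
    """Exact probability that exactly x of n people fall on one given day."""
    return math.comb(n, x) * p**x * (1 - p) ** (n - x)

def poisson_pmf(lam, x):
    """Poisson probability of x events with mean lam."""
    return math.exp(-lam) * lam**x / math.factorial(x)

n, k = 1000, 366  # n = 1000 is an assumed, illustrative sample size
lam = n / k       # Poisson mean for a single day

# Largest pointwise gap between the two pmfs over x = 0..20
max_abs_diff = max(
    abs(binomial_pmf(n, 1 / k, x) - poisson_pmf(lam, x)) for x in range(21)
)
print(f"lambda = {lam:.3f}, max |binomial - poisson| = {max_abs_diff:.2e}")
```

For a single bin the two distributions should be very close whenever $1/k$ is small, which is the case here.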

But what would be a good approximation for the numbers of people on all 366 days jointly?

While a Poisson random variable is a good approximation for a single bin, simply treating the $k$ counts as independent Poisson random variables would ignore the dependencies between the bins, since the counts must sum to $n$. Is there any better way to approximate the joint distribution?
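To make the dependence concrete, here is a small sketch under the same uniform and independent assumption, again with an assumed $n = 1000$. It simulates one allocation of people to days and computes the exact variance, covariance, and correlation of the day counts from the standard multinomial formulas $\operatorname{Var}(X_i) = np(1-p)$ and $\operatorname{Cov}(X_i, X_j) = -np^2$ with $p = 1/k$.

```python
import random
from collections import Counter

def simulate_day_counts(n, k, rng):
    """Assign n birthdays uniformly at random to k days; return the k counts."""
    counts = Counter(rng.randrange(k) for _ in range(n))
    return [counts[day] for day in range(k)]

n, k = 1000, 366          # n = 1000 is an assumed, illustrative sample size
rng = random.Random(0)    # fixed seed so the run is reproducible

counts = simulate_day_counts(n, k, rng)
total = sum(counts)       # fixed at n by construction, unlike independent Poissons

p = 1 / k
var_single = n * p * (1 - p)        # variance of one day's count
cov_pair = -n * p * p               # covariance between two different days
corr_pair = cov_pair / var_single   # simplifies to -1 / (k - 1)

print(total, var_single, cov_pair, corr_pair)
```

The correlation between any two days is $-1/(k-1)$, so it is negative but very weak for $k = 366$. The main thing independent Poisson counts get wrong is the total, which would itself be random with variance $n$ instead of being fixed at $n$.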

mloning
  • Even for any single day you could already have more than $n$ individuals in total, even though this has a low probability. – qeschaton Jul 01 '18 at 11:07
  • This is really an empirical question, and the answer may well be different for different countries. An answer containing data for Norway: https://stats.stackexchange.com/questions/80738/what-is-the-probability-that-a-person-will-die-on-their-birthday/336676#336676 – kjetil b halvorsen Jul 01 '18 at 11:27
  • My question is more about whether the Poisson distribution is still a good approximation in this case and what other approximation one could use. I've clarified my question accordingly. – mloning Jul 01 '18 at 13:24
  • As your second paragraph shows, this is a multinomial distribution--there's no need to approximate it. – whuber Jul 01 '18 at 15:11
  • @whuber Yes, that's what I was looking for, thanks! – mloning Jul 02 '18 at 06:07
