The negative binomial distribution has become a popular model for count data in bioinformatics, specifically for the number of sequencing reads that fall within a given region of the genome in a given experiment. Explanations vary:
- Some explain it as a Poisson-like distribution with an additional parameter, which gives more freedom to model the true distribution because the variance need not equal the mean.
- Some explain it as a weighted mixture of Poisson distributions (with a gamma mixing distribution on the Poisson parameter).
Is there a way to square these rationales with the traditional definition of the negative binomial distribution as modeling the number of successes in a sequence of Bernoulli trials before a specified number of failures is seen? Or should I just think of it as a happy coincidence that a weighted mixture of Poisson distributions with a gamma mixing distribution has the same probability mass function as the negative binomial?
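To make the equivalence I am asking about concrete, here is a minimal numerical sketch using only the Python standard library. It integrates a Poisson pmf against a gamma density and compares the result with the "successes before `r` failures" pmf. The function names, the values `shape = 3` and `scale = 2`, and the midpoint-rule settings are my own illustrative choices, and the mapping `p_fail = 1 / (1 + scale)` is the standard one as far as I understand it.

```python
import math

def poisson_pmf(k, lam):
    # P(K = k) for K ~ Poisson(lam), computed in log space (requires lam > 0)
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))

def gamma_pdf(lam, shape, scale):
    # Density of Gamma(shape, scale) at lam > 0
    return math.exp((shape - 1) * math.log(lam) - lam / scale
                    - math.lgamma(shape) - shape * math.log(scale))

def mixture_pmf(k, shape, scale, upper=100.0, steps=50_000):
    # Midpoint-rule approximation of the integral over lam of
    # Poisson(k | lam) * Gamma(lam | shape, scale).
    # `upper` and `steps` are assumed settings, adequate for the values used here.
    h = upper / steps
    total = 0.0
    for i in range(steps):
        lam = (i + 0.5) * h
        total += poisson_pmf(k, lam) * gamma_pdf(lam, shape, scale)
    return total * h

def negbin_pmf(k, r, p_fail):
    # P(k successes before the r-th failure), where each trial fails with
    # probability p_fail; lgamma lets r be non-integer as well.
    return math.exp(math.lgamma(k + r) - math.lgamma(r) - math.lgamma(k + 1)
                    + r * math.log(p_fail) + k * math.log(1.0 - p_fail))

if __name__ == "__main__":
    shape, scale = 3.0, 2.0          # illustrative gamma parameters
    p_fail = 1.0 / (1.0 + scale)     # assumed mapping to the Bernoulli picture
    for k in range(8):
        print(k, mixture_pmf(k, shape, scale), negbin_pmf(k, shape, p_fail))
```

Under these assumptions, the gamma shape plays the role of the number of failures and the gamma scale fixes the odds of success to failure, so the two columns should agree up to integration error.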