
I have a mean $\mu$ and a variance $\sigma^2$ with underdispersion, i.e., $\sigma^2<\mu$. Is there a standard discrete distribution with these moments and unbounded-on-the-right support, i.e., support on $\{0, 1, \dots\}$?

Bonus points if it is implemented in R.


  • I looked at the , but that is only defined if the size parameter $\frac{\mu}{1-\frac{\sigma^2}{\mu}}$ is an integer.
  • The binomial and binomial compounds like the have bounded support.
  • So does the Generalized Poisson distribution (Consul & Jain, 1973) in the case of underdispersion, plus it can only handle underdispersion to a certain degree (note that Consul & Jain require $|\lambda_2|<1$ in formula (3.1)). The Generalized Poisson is Joseph Hilbe's main recommendation in this answer of his. His other recommendations might be useful, but he gives no details on them, and searching for the names is not very successful.
  • Sampling from under/over-dispersed count data in R is related but does not have a helpful answer.
  • Quasi-Poisson models sound like they may be useful (e.g., here), but I haven't been able to find anything helpful outside the context of a regression.
Stephan Kolassa
  • Hilbe's answer to this question https://stats.stackexchange.com/questions/67385/what-is-the-appropriate-model-for-underdispersed-count-data might be helpful. – jbowman Oct 17 '18 at 16:27
  • Not familiar with applications of this, but how about $\sqrt{X},$ where $X \sim \mathsf{Pois}(\lambda)?$ `set.seed(1017); x=rpois(10^5,3); mean(sqrt(x)); var(sqrt(x))` returns $1.62883 > 0.3404474.$ Round or take floor if you need integers. – BruceET Oct 17 '18 at 17:33
  • @jbowman: Hilbe's main suggestion is the Generalized Poisson, which has bounded support if underdispersed. (I did find his answer when writing this question, upvoted it and looked through his *Negative Binomial Regression*.) His other suggestions might be helpful if there were just a few pointers to literature. – Stephan Kolassa Oct 17 '18 at 17:34
  • @BruceET: thanks, but that presupposes a very specific relationship between the mean and the variance, and I'd like something that works for general moments. – Stephan Kolassa Oct 17 '18 at 17:35
  • The Conway-Maxwell-Poisson distribution has unbounded support and R packages, including one for regression, so may fit your requirements: https://en.wikipedia.org/wiki/Conway%E2%80%93Maxwell%E2%80%93Poisson_distribution. When I looked at it some years ago, computation was very slow, but it looks like much faster algorithms have been developed since then. – jbowman Oct 17 '18 at 17:50
  • @jbowman: thanks, that looks very helpful indeed! Do you want to post your comment as an answer? – Stephan Kolassa Oct 17 '18 at 17:55
  • @Stephan 1. When you say "unbounded" do you mean unbounded on both the left and right, or only on the right? I'd have thought you meant the first but several aspects of your question/comments suggest otherwise. 2. If you want an actual exponential-family quasi-Poisson distribution then I think it's just a scaled-Poisson. If that's underdispersed, it will be discrete but can't be restricted to be on the integers. – Glen_b Oct 18 '18 at 00:58
  • @Glen_b: 1. Only on the right. (I'd be interested in which of my comments suggest otherwise. Am I confused?) I edited the question to clarify. 2. I don't need an actual exponential family distribution, anything is fine. – Stephan Kolassa Oct 18 '18 at 07:07
  • @Stephan I initially interpreted "unbounded support" to imply unbounded in both directions (and likely would again in the same circumstances, unless there were other information); your comments and some aspects of your question suggested otherwise (i.e. that it was more like a count, and bounded on one side, rather than simply discrete and completely without bounds). I'm not criticizing the question, simply trying to be sure I understood the intent properly. – Glen_b Oct 18 '18 at 15:26
  • There are many possible answers. The most trivial is to add a sufficiently large integer to any lattice variable, thereby increasing $\mu$ without changing $\sigma^2.$ Even restricting distributions to those implemented in `R`, you could (for instance) reduce the dispersion by using $\lfloor X/k \rfloor$ for $k\gt 1$ with $X$ any non-negative variable (discrete or not). As in almost all such cases, it's likely more constructive to articulate the *statistical* problem you are trying to solve so that a suitable choice of distribution (family) can be made. – whuber Jan 18 '21 at 20:00
  • @whuber: I am (was) modeling retail sales. These are low volume count data, often intermittent (i.e., many zeros). The majority of such time series is overdispersed, and a negative binomial distribution makes sense. Sometimes, they turn out to be underdispersed, and I wonder what distribution would reasonably describe them. I have played around a bit with the Conway-Maxwell-Poisson per jbowman's answer, but if there is anything else, I would be interested. – Stephan Kolassa Jan 18 '21 at 22:36
  • The start of any effective search for a distribution to model your data ought to be with an analysis of the empirical distribution of those data (which may, and ideally ought, to include any information gleaned from an understanding of what those data mean and how they were generated). Without any such information as a guide, there's nothing to say beyond outlining the purely mathematical description of the set of all such distributions. – whuber Jan 19 '21 at 14:08

1 Answer


The Conway-Maxwell-Poisson distribution (https://en.wikipedia.org/wiki/Conway%E2%80%93Maxwell%E2%80%93Poisson_distribution) has unbounded support (on the right) and can model both under- and over-dispersion (relative to the Poisson) seamlessly through a single parameter. The Poisson is a special case. It cannot handle arbitrary amounts of underdispersion, though, and it is relatively computationally intensive. It is a member of the exponential family of distributions.
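
To get a feel for how the dispersion parameter behaves, the probability mass function $P(X=x) \propto \lambda^x/(x!)^\nu$ can be evaluated directly in base R. The following is only a minimal sketch and is not taken from any of the packages: the helper names `cmp_pmf` and `cmp_moments` are made up for illustration, and the normalising constant is approximated by truncating the infinite sum at `max_x`, which assumes the tail beyond that point is negligible.

```r
# Hypothetical helper: Conway-Maxwell-Poisson pmf on 0..max_x.
# Weights are computed on the log scale and normalised by a truncated sum
# (assumption: the probability mass beyond max_x is negligible).
cmp_pmf <- function(lambda, nu, max_x = 200) {
  x <- 0:max_x
  log_w <- x * log(lambda) - nu * lgamma(x + 1)  # log of lambda^x / (x!)^nu
  w <- exp(log_w - max(log_w))                   # stabilise before normalising
  w / sum(w)
}

# Hypothetical helper: mean and variance computed from the truncated pmf.
cmp_moments <- function(lambda, nu, max_x = 200) {
  x <- 0:max_x
  p <- cmp_pmf(lambda, nu, max_x)
  m <- sum(x * p)
  c(mean = m, var = sum((x - m)^2 * p))
}

cmp_moments(lambda = 3, nu = 1)  # nu = 1 is the Poisson: mean and variance both about 3
cmp_moments(lambda = 9, nu = 2)  # nu > 1: variance below the mean
```

Values of $\nu > 1$ give underdispersion and values of $\nu < 1$ give overdispersion. For small $\nu$ or large $\lambda$ the truncation point would need to be raised.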

R packages exist for both estimation and regression:

https://cran.r-project.org/web/packages/CompGLM/CompGLM.pdf

https://cran.r-project.org/web/packages/COMPoissonReg/COMPoissonReg.pdf

https://cran.r-project.org/web/packages/compoisson/compoisson.pdf

although I haven't used them so cannot make any helpful comments about their relative quality / usefulness!

jbowman