Distribution of random variable with multinomial sampling distribution and parameters $(n,p)$, where $n\sim$ Poisson with truncation

Question

Suppose you have:

$$X\mid N\sim\text{MN}(N,p_1,p_2,\ldots,p_{J})$$

$$N\sim \text{Poisson}(\lambda)$$

What is the marginal distribution of $X$? In this case, the answer is simple and well known: the components $X_j$ are independent Poisson random variables with means $\lambda p_j$.
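For reference, the untruncated marginal follows from a short calculation. This is only a sketch, assuming the $p_j$ sum to one and writing $n=\sum_j x_j$:

$$\begin{aligned} \mathbb{P}(X = x) &= \mathbb{P}(N = n) \cdot \mathbb{P}(X = x \mid N = n) \\[6pt] &= \frac{e^{-\lambda} \lambda^{n}}{n!} \cdot \frac{n!}{\prod_{j=1}^J x_j!} \prod_{j=1}^J p_j^{x_j} \\[6pt] &= \prod_{j=1}^J \frac{e^{-\lambda p_j} (\lambda p_j)^{x_j}}{x_j!}, \end{aligned}$$

using $e^{-\lambda} = \prod_j e^{-\lambda p_j}$ and $\lambda^n = \prod_j \lambda^{x_j}$. Each factor is a Poisson mass function with mean $\lambda p_j$, so the counts are independent.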

But...

Suppose further that the probabilities are:

$$p_j=\frac{\exp(V_{j})}{1+\sum_{k=1}^J\exp\left(V_{k}\right)}$$

However, what if we also have a vector of maximum quantities, one for each component $1,\ldots,J$?

For example,

$$K=\{1,1,2,\ldots,3\}$$

This means that once the first unit of component 1 is sold (because $K_1=1$), component 1 is removed from the choice set and no more of it can be sold, so the multinomial choice probabilities change to:

$$p_j=\dfrac{\exp(V_{j})}{1+\sum_{k\neq1} ^J\exp\left(V_{k}\right)} \quad\forall\, j\neq1$$

Is there an analytic expression for the conditional distribution of $X \mid K$? If not, what would be an efficient way to simulate this?

Ben · Accepted Answer · 2019-04-24T23:10:41.420

At the moment your model is not clearly defined, so it is premature to seek the distribution of the count vector $X$. You are seeking to generalise from the multinomial distribution, which is the distribution of count values over categories for an underlying sequence of independent categorical random variables with a specified probability vector. In your generalisation you say you want to impose maximums on each of the counts, but you also want to keep the specified probabilities used in the multinomial distribution. (You also mis-specify the form of the probabilities; with your formula the elements of the probability vector don't add to one.)

In order to make your model well-defined you will need to fix the form of your probability vector so that its elements add to one. (Presumably you meant to have an additional element $p_0 = 1/(1+\sum_k \exp(v_k))$.) Let $\mathbf{k} = (k_0,...,k_J)$ be the imposed maximum counts across the categories, and let $\dot{k} \equiv \sum_j k_j$ be the total maximum count. (I will use lower case here since this is assumed to be fixed rather than random.) You then have a sequence of values $W_1,...,W_{\dot{k}}$ such that:

$$\mathbb{P}(W_i = j|W_1,...,W_{i-1}) = \frac{\exp(v_j) \cdot I_{i,j}}{I_{i,0} + \sum_{\ell=1}^J \exp(v_\ell) \cdot I_{i,\ell}},$$

where $v_0 \equiv 0$ for the outside option, $X_j(i) \equiv \sum_{r=1}^{i} \mathbb{I}(W_r = j)$ are the category counts after the $i$th value is selected (with $X_j(0) = 0$), and the indicator $I_{i,j} \equiv \mathbb{I} ( X_j(i-1) < k_j )$ records whether category $j$ is still available when the $i$th value is selected. Notably, after observing $\dot{k}$ values from this model, the category counts must necessarily be at the maximums, so $\mathbf{X}(\dot{k}) = \mathbf{k}$.

This is a complicated generalisation of the multinomial model, with the latter occurring in the special case where $\mathbf{k} = (\infty,...,\infty)$, so that there are no maximums on the category counts.

Writing the distribution of $\mathbf{X}$: The count vector $\mathbf{X}$ is a function of the underlying categorical variables $W_1,...,W_N$. Denote this function by $T:\mathbf{W} \mapsto \mathbf{X}$. For any given $N \leqslant \dot{k}$ we define the set of length-$N$ category vectors:

$$\mathscr{S}(\mathbf{x}, \mathbf{k}) \equiv \Bigg\{ (w_1,...,w_N) \in \{ 0,...,J \}^N \Bigg| T(\mathbf{w}) = \mathbf{x} \leqslant \mathbf{k} \Bigg\}.$$

The set $\mathscr{S}(\mathbf{x}, \mathbf{k})$ contains all vectors $(w_1,...,w_N)$ that lead to an admissible count vector $\mathbf{x}$ (admissible in the sense that none of the counts are above the specified maximum). The distribution of the count vector is then:

$$\begin{aligned} \mathbb{P}(\mathbf{X} = \mathbf{x}) &= \mathbb{P}(\mathbf{W} \in \mathscr{S}(\mathbf{x}, \mathbf{k})) \\[6pt] &= \sum_{\mathbf{w} \in \mathscr{S}(\mathbf{x}, \mathbf{k})} \mathbb{P}(\mathbf{W} = \mathbf{w}) \\[6pt] &= \sum_{\mathbf{w} \in \mathscr{S}(\mathbf{x}, \mathbf{k})} \prod_{i=1}^N \mathbb{P}(W_i = w_i|W_1 = w_1,...,W_{i-1} = w_{i-1}) \\[6pt] &= \sum_{\mathbf{w} \in \mathscr{S}(\mathbf{x}, \mathbf{k})} \prod_{i=1}^N \frac{\exp(v_{w_i}) \cdot I_{i,{w_i}}}{I_{i,0} + \sum_{\ell=1}^J \exp(v_\ell) \cdot I_{i,\ell}}. \end{aligned}$$

As you can see, this is a horrendously complicated expression. Unless $N$ and $J$ are both very small, the summation in this expression will get very complicated indeed. Technically this is a closed form expression, since it is a finite sum of terms involving elementary operations. However, the expression is heavily complicated by the fact that it involves a sum over a combinatorial set that is hard to construct, and the summand involves a product of terms that each depend on the counts from the previous elements of the category vector.
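For very small $N$ and $J$ the sum can still be evaluated by brute force. The following is a minimal Python sketch of that enumeration, not an efficient method. It assumes category 0 is the outside option with $v_0 = 0$, and that caps are given as a list with `math.inf` for uncapped categories. The function name `exact_pmf` is hypothetical.

```python
import itertools
import math

def exact_pmf(v, k, n):
    """Return {count vector: probability} for X(n) by enumerating sequences.

    v : utilities for categories 0..J (v[0] = 0 assumed for the outside option)
    k : maximum counts for categories 0..J (math.inf for no cap)
    n : number of sequential choices
    """
    n_cat = len(v)
    weights = [math.exp(u) for u in v]
    pmf = {}
    # Loop over every category sequence (w_1, ..., w_n); cost is (J+1)^n.
    for w in itertools.product(range(n_cat), repeat=n):
        counts = [0] * n_cat
        prob = 1.0
        for j in w:
            # Availability indicators based on counts before this choice.
            avail = [counts[c] < k[c] for c in range(n_cat)]
            if not avail[j]:
                prob = 0.0  # sequence picks a sold-out category
                break
            denom = sum(weights[c] for c in range(n_cat) if avail[c])
            prob *= weights[j] / denom
            counts[j] += 1
        if prob > 0.0:
            key = tuple(counts)
            pmf[key] = pmf.get(key, 0.0) + prob
    return pmf
```

For example, `exact_pmf([0.0, 0.5, -0.2], [math.inf, 1, 2], 3)` returns a dictionary keyed by admissible count vectors. The number of sequences grows as $(J+1)^N$, which is why this only works for toy problems.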

I am unaware of any way this can be simplified further. In view of this, there does not appear to me to be any simple closed form expression for the distribution of the count vector $\mathbf{X}(N)$ (except for the trivial case where $N=\dot{k}$). You would therefore need to generate the distribution of the count vector by simulation from the model. It is unlikely that there can be any efficiency gain beyond simulating the underlying values in the model from the above categorical distributions.
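A minimal sketch of such a simulation in Python with NumPy is given here. It rests on assumptions not stated above: category 0 is the outside option with $v_0 = 0$, caps are passed as a vector with `np.inf` for uncapped categories, $N$ is drawn from an untruncated Poisson($\lambda$), and any arrivals after every category has sold out are simply dropped. The function names `simulate_counts` and `simulate_marginal` are hypothetical.

```python
import numpy as np

def simulate_counts(v, k, n, rng):
    """Simulate one count vector X(n) under capped sequential choice.

    v : utilities for categories 0..J (v[0] = 0 assumed for the outside option)
    k : maximum counts for categories 0..J (np.inf for no cap)
    n : number of sequential choices
    """
    weights = np.exp(np.asarray(v, dtype=float))
    caps = np.asarray(k, dtype=float)
    counts = np.zeros(len(weights), dtype=int)
    for _ in range(n):
        available = counts < caps      # availability indicators
        if not available.any():        # assumption: drop arrivals once all sold out
            break
        probs = weights * available
        probs = probs / probs.sum()
        j = rng.choice(len(weights), p=probs)
        counts[j] += 1
    return counts

def simulate_marginal(v, k, lam, n_sims, seed=0):
    """Draw n_sims count vectors with N ~ Poisson(lam) drawn afresh each time."""
    rng = np.random.default_rng(seed)
    draws = np.empty((n_sims, len(v)), dtype=int)
    for s in range(n_sims):
        n = rng.poisson(lam)
        draws[s] = simulate_counts(v, k, n, rng)
    return draws
```

Empirical frequencies of the rows of `simulate_marginal(v, k, lam, n_sims)` then approximate the marginal distribution of the count vector. Each draw costs one categorical sample per arrival, which matches the point above that little efficiency can be gained beyond simulating the underlying values directly.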

Hi Ben, thanks for your reply. Fair enough, wrt the outside option and probabilities summing to 1 (just carelessness on my part, I apologize if it made it unclear!) Nevertheless, how do you know there is no closed form expression for the distribution of the count vector? — wolfsatthedoor, Apr 24 '19 at 20:59
I have now added an additional section showing an expression for the distribution of the count vector, and explaining why I think it has no simple form. — Ben, Apr 24 '19 at 23:11
