
Consider a discrete distribution $X$ that is a mixture of two discrete distributions $A$ and $B$. Explicitly, $X=A$ with probability $p$ and $X=B$ with probability $1-p$. Denote the pgfs of $A$ and $B$ as $\mathcal{G}_A$ and $\mathcal{G}_B$ respectively.

I want to write an expression for the pgf $\mathcal{G}_X$ of $X$. Here's what I've done so far. Am I on the right track?

Let $Y\sim X$ and denote the probability measure on $A$ and $B$ as $\mathbb{P}_A$ and $\mathbb{P}_B$ respectively. Then: $$ \begin{align*} \mathbb{P}(Y=k) &= \begin{cases} \mathbb{P}_A(Y=k) & \textrm{with probability $p$} \\ \mathbb{P}_B(Y=k) & \textrm{with probability $1-p$} \end{cases} \\ &= p\mathbb{P}_A(Y=k) + (1-p)\mathbb{P}_B(Y=k) \end{align*} $$

Then $$ \begin{align*} \mathcal{G}_X(z) &= \sum_{k=0}^{\infty}z^k\mathbb{P}(Y=k) \\ &= \sum_{k=0}^{\infty}z^k\left(p\mathbb{P}_A(Y=k) + (1-p)\mathbb{P}_B(Y=k)\right) \\ &= \sum_{k=0}^{\infty}pz^k\mathbb{P}_A(Y=k) + \sum_{k=0}^{\infty}(1-p)z^k\mathbb{P}_B(Y=k) \\ &= p\sum_{k=0}^{\infty}z^k\mathbb{P}_A(Y=k) + (1-p)\sum_{k=0}^{\infty}z^k\mathbb{P}_B(Y=k) \\ &= p\mathcal{G}_A(z)+(1-p)\mathcal{G}_B(z) \end{align*} $$
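As a quick numerical sanity check of the last line, here is a minimal sketch. The specific choices ($A \sim \text{Poisson}(2)$, $B \sim \text{Binomial}(5, 0.4)$, $p = 0.3$, $z = 0.7$) are assumptions made up for illustration and are not part of the question. The infinite sum is truncated at `k_max`, which is harmless here because the Poisson tail is negligible.

```python
import math

def poisson_pmf(k, lam):
    # P(A = k) for A ~ Poisson(lam)
    return math.exp(-lam) * lam**k / math.factorial(k)

def binomial_pmf(k, n, q):
    # P(B = k) for B ~ Binomial(n, q); zero outside 0..n
    return math.comb(n, k) * q**k * (1 - q) ** (n - k) if 0 <= k <= n else 0.0

def pgf_from_pmf(pmf, z, k_max=100):
    # Truncated version of sum_k z^k P(. = k)
    return sum(z**k * pmf(k) for k in range(k_max + 1))

# Illustrative (assumed) parameters
p = 0.3
pmf_a = lambda k: poisson_pmf(k, 2.0)
pmf_b = lambda k: binomial_pmf(k, 5, 0.4)
pmf_x = lambda k: p * pmf_a(k) + (1 - p) * pmf_b(k)  # mixture pmf

z = 0.7
lhs = pgf_from_pmf(pmf_x, z)                                          # G_X(z) from the mixture pmf
rhs = p * pgf_from_pmf(pmf_a, z) + (1 - p) * pgf_from_pmf(pmf_b, z)   # p G_A(z) + (1-p) G_B(z)

# Same quantity from the known closed-form pgfs of the two components
closed = p * math.exp(2.0 * (z - 1)) + (1 - p) * (1 - 0.4 + 0.4 * z) ** 5

print(lhs, rhs, closed)
```

All three printed values should agree up to floating-point rounding.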

Andrew

1 Answer


The solution and the general method are both correct, but some of your notation is unnecessary (e.g., there is no need to define $Y$ at all). This result is a special case of a more general rule: the probability generating function of any mixture random variable can be written as a weighted average of the probability generating functions of its mixture components, with weights equal to the mixture probabilities.

Theorem: Let $X$ be a mixture of the random variables $A_1,\ldots,A_k$ with mixture probabilities $p_1,\ldots,p_k$. Then the probability generating function of $X$ can be written as: $$\mathcal{G}_{X}(z) = \sum_{i=1}^k p_i \cdot \mathcal{G}_{A_i}(z).$$

Proof: From the stated mixture form we have $X = A_H$, where $H \sim \text{Categorical}(p_1,\ldots,p_k)$ is drawn independently of $A_1,\ldots,A_k$, so that $\mathbb{E}(z^{A_H}|H=i) = \mathcal{G}_{A_i}(z)$. Thus, using the law of total expectation, we have: $$\begin{aligned} \mathcal{G}_{X}(z) = \mathbb{E}(z^X) &= \mathbb{E}(z^{A_H}) \\[6pt] &= \mathbb{E}( \mathbb{E}(z^{A_H}|H)) \\[6pt] &= \mathbb{E}( \mathcal{G}_{A_H}(z)) \\[6pt] &= \sum_{i=1}^k p_i \cdot \mathcal{G}_{A_i}(z), \end{aligned}$$ which was to be shown. $\blacksquare$

Observe that nothing in this proof requires the random variables $A_1,\ldots,A_k$ to be discrete.
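The construction $X = A_H$ used in the proof can also be checked by simulation. The sketch below is only an illustration under assumed inputs: three hypothetical components (a fair die, a $\text{Binomial}(4, 0.5)$ count, and a point mass at $0$) with assumed weights $0.5, 0.3, 0.2$. It draws $H$ first, then draws from $A_H$, and compares the Monte Carlo average of $z^X$ with the weighted sum of component pgfs.

```python
import random

# Assumed mixture weights p_1, ..., p_k (illustrative only)
weights = [0.5, 0.3, 0.2]

# Hypothetical components: each sampler draws one value of A_i
samplers = [
    lambda rng: rng.randint(1, 6),                          # fair die on 1..6
    lambda rng: sum(rng.random() < 0.5 for _ in range(4)),  # Binomial(4, 0.5)
    lambda rng: 0,                                          # point mass at 0
]

# Closed-form pgfs of the same components, in the same order
pgfs = [
    lambda z: sum(z**j for j in range(1, 7)) / 6,
    lambda z: (0.5 + 0.5 * z) ** 4,
    lambda z: 1.0,
]

def mixture_pgf(z):
    # Right-hand side of the theorem: sum_i p_i * G_{A_i}(z)
    return sum(w * g(z) for w, g in zip(weights, pgfs))

def simulate_pgf(z, n=200_000, seed=0):
    # Monte Carlo estimate of E(z^X) with X = A_H, H ~ Categorical(weights)
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        h = rng.choices(range(len(weights)), weights=weights)[0]
        total += z ** samplers[h](rng)
    return total / n

z = 0.8
exact = mixture_pgf(z)
estimate = simulate_pgf(z)
print(exact, estimate)
```

The two numbers should differ only by Monte Carlo error, which shrinks at rate $1/\sqrt{n}$.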

Ben
  • +1 -- but by deleting the unnecessary word "convex" everywhere you will make the underlying idea even clearer and provide an interesting generalization. – whuber Aug 09 '19 at 12:16
  • @whuber: Edited. – Ben Aug 09 '19 at 14:45
  • Thank you. Note that the $p_i$ needn't be probabilities: there are applications where distributions are expressed as linear combinations of others with negative coefficients (or even complex coefficients). See https://stats.stackexchange.com/a/72486/919 for an example. – whuber Aug 09 '19 at 20:28
  • @whuber: Fair enough, but would you still call that a "mixture distribution"? – Ben Aug 09 '19 at 23:22
  • I understand why one might want to restrict the coefficients to positive numbers, but have yet to run into a situation where that restriction was necessary or useful. In some sense, allowing for negative coefficients is akin to introducing imaginary numbers in algebra: they might not admit the same interpretations as ordinary numbers, but they can be useful and ultimately might provide more satisfactory explanations of otherwise mysterious behaviors. – whuber Aug 10 '19 at 14:30
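
To make the point about negative coefficients concrete, here is a small sketch. It is not taken from the linked answer; it is an example assumed for illustration. The sum of two independent geometric variables on $\{0, 1, 2, \ldots\}$ with distinct success probabilities $a$ and $b$ has a pmf (and hence a pgf) that is a linear combination of the two geometric pmfs (pgfs), with coefficients that sum to one but one of which is negative. The coefficients follow from a partial-fraction expansion of the product of the two geometric pgfs.

```python
# Assumed success probabilities (illustrative); must be distinct
a, b = 0.3, 0.6
qa, qb = 1 - a, 1 - b

def geom_pmf(k, s):
    # P(G = k) for a geometric variable counting failures before the first success
    return s * (1 - s) ** k

def geom_pgf(z, s):
    # pgf of the same geometric variable, valid for |z| < 1 / (1 - s)
    return s / (1 - (1 - s) * z)

# Coefficients from the partial-fraction expansion of geom_pgf(z, a) * geom_pgf(z, b)
w_a = b * qa / (qa - qb)
w_b = -a * qb / (qa - qb)

def sum_pmf(k):
    # pmf of the sum of the two independent geometrics, by direct convolution
    return sum(geom_pmf(j, a) * geom_pmf(k - j, b) for j in range(k + 1))

z = 0.5
pgf_sum = geom_pgf(z, a) * geom_pgf(z, b)                 # pgf of the sum
pgf_combo = w_a * geom_pgf(z, a) + w_b * geom_pgf(z, b)   # signed "mixture" of pgfs

print(w_a, w_b, pgf_sum, pgf_combo)
```

Here `w_b` is negative, so the combination is not a mixture in the usual sense, yet the same linear pgf formula applies.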