How can I write the probability distribution of a vector of probabilities when a subset of this vector is specified to be Dirichlet?

Question

My objective is to write down the form of $p(\mathbf{q})$, where $q_j$ is the probability of event $j$ occurring. Suppose $\mathbf{q}$ has dimension $N$, but we know that only $n$ ($<N$) of its components are nonzero. Denoting the vector of these nonzero probabilities by $\mathbf{q}^*$, we are given that $$\mathbf{q}^*\sim \text{Dirichlet}(n, \alpha_1,\ldots,\alpha_n)$$

Given the prior on this subset of $\mathbf{q}$, is there an easy way to define the prior on the entire $\mathbf{q}$?

Great question! Short answer: we do not. Long answer: this is just one of several priors in a fairly involved hierarchical model; within this model, we are actually trying to determine which $n$ of the $N$ components are nonzero. Within the scope of this question, however, we do not know which $n$ components are nonzero. — Oski, Feb 24 '22 at 17:08
I suppose it comes down to what you consider "easily define" to mean. It *sounds* like you would want to adopt a uniform prior on the set of $n$-subsets of the $N$ vector components. Would this be correct? If so, what would constitute an "easy" description of this prior? — whuber, Feb 24 '22 at 17:11
@whuber I think I see what you mean. Please correct me if I'm misinterpreting you. Since we don't know which $n$ of the $N$ components are nonzero, you're suggesting that we make a discrete uniform draw from the set of possible $n$-subsets of the $N$ components to determine which ones we will treat as nonzero. Denoting by $\mathcal{N}$ the set of possible $n$-subsets, we can write the desired prior as $p(\mathbf{q})=U(\mathcal{N})p(\mathbf{q}_0^*)$, where $\mathbf{q}^*_0$ is the vector of the $n$ components selected by our draw from $U(\mathcal{N})$. Is this right? — Oski, Feb 24 '22 at 17:23
@whuber Thank you for your help! May I ask why this is different from using combinatorics to determine which $n$ components are nonzero? For example, instead of using $U(\mathcal{N})$, what if we used $\binom{N}{n}$ to determine which components are nonzero? In this case, we could have $p(\mathbf{q})=\binom{N}{n}p(\mathbf{q}^*_0)$. — Oski, Feb 24 '22 at 17:42
@whuber If we were to know which $n$ components were nonzero, how would that change our expression for $p(\mathbf{q})$? For example, suppose that we know the first $n$ components of $\mathbf{q}$ are nonzero. Then, we can write $p(\mathbf{q})=(\tilde{\mathbf{q}},0,...,0)$, where there are $N-n$ zeros. Would this allow us to get a different expression for $p(\mathbf{q})$? — Oski, Feb 25 '22 at 16:31
I don't know, because your expression is informal and a little unclear. Ultimately, you have to describe a *singular* distribution: all its probability is concentrated on a set of zero volume. Usually the easiest way to do that is to push a distribution forward: describe it mathematically in a smaller space, where it is a continuous distribution, and give a mapping into the original space. Then, you can use the inverse of that mapping to do all your analytical work in the smaller space. At https://stats.stackexchange.com/a/159322/919 I provide the details for Normal distributions. — whuber, Feb 25 '22 at 16:37
@whuber Thank you for your help. The linked post is giving me a lot to think about. I appreciate your time. I don't have enough reputation to upvote your comments, but I surely will in the future! :) — Oski, Feb 25 '22 at 16:52
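The construction discussed in the comments (a uniform draw of the support followed by a Dirichlet draw on it, mapped into the full $N$-dimensional space) can be sketched in code. This is only an illustration under stated assumptions, not a confirmed answer: the uniform prior over $n$-subsets is the suggestion from the comments, and pairing $\alpha_k$ with the $k$-th smallest selected index is an extra assumption made here. The function names `sample_sparse_dirichlet` and `log_prior` are hypothetical. The density returned by `log_prior` is taken with respect to counting measure on the supports times Lebesgue measure on each $(n-1)$-simplex, so the uniform subset choice contributes a factor $1/\binom{N}{n}$ rather than $\binom{N}{n}$.

```python
import math
import random


def sample_sparse_dirichlet(N, alpha, rng):
    """Draw q of length N with exactly n = len(alpha) nonzero entries."""
    n = len(alpha)
    # Uniform draw of an n-subset of {0, ..., N-1}.
    # Assumption: alpha[k] is attached to the k-th smallest selected index.
    support = sorted(rng.sample(range(N), n))
    # Dirichlet draw via normalized independent Gamma(alpha_k, 1) variates.
    gammas = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(gammas)
    # Push forward: embed the n-vector into the N-dimensional space.
    q = [0.0] * N
    for j, g in zip(support, gammas):
        q[j] = g / total
    return q, support


def log_prior(q, alpha):
    """Log density of q under the uniform-support mixture of Dirichlets."""
    if any(x < 0.0 for x in q):
        return -math.inf
    support = [j for j, x in enumerate(q) if x > 0.0]
    if len(support) != len(alpha):
        return -math.inf
    if not math.isclose(sum(q), 1.0, abs_tol=1e-9):
        return -math.inf
    # Log normalizing constant of the Dirichlet density.
    log_norm = math.lgamma(sum(alpha)) - sum(math.lgamma(a) for a in alpha)
    log_kernel = sum((a - 1.0) * math.log(q[j]) for a, j in zip(alpha, support))
    # Uniform prior over supports contributes -log C(N, n).
    return -math.log(math.comb(len(q), len(alpha))) + log_norm + log_kernel


rng = random.Random(0)
q, support = sample_sparse_dirichlet(6, [2.0, 3.0, 4.0], rng)
print(support, q, log_prior(q, [2.0, 3.0, 4.0]))
```

All analytical work happens on the $n$-dimensional vector; the zeros are only filled in by the embedding step, which is the pushforward idea described above.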

