
I'm looking for some kind of distribution over the simplex in which the components are correlated in an ordinal way. That is, if $p = (p_1, \dots, p_J)$ is drawn from our distribution on the simplex, I would like $p_i$ to be positively correlated with its neighbours $p_{i + 1}$ and $p_{i - 1}$, say. A vanilla Dirichlet clearly cannot satisfy this requirement.

One option, I suppose, is a mixture of Dirichlet distributions; for example, when $J = 4$ one could take $\mathcal D(1, 1, 0, 0) + \mathcal D(0, 1, 1, 0) + \mathcal D(0, 0, 1, 1)$ or something similar to induce correlation, but I'm wondering if there is something a little more natural. Another option is to take a parametric family of distributions on $\{1, 2, \dots, J\}$, say $f(j \mid \eta)$, put a distribution on $\eta$, and set $p_j = f(j \mid \eta)$. For example, I could take $\eta \sim \mbox{Beta}(\alpha, \beta)$ and let $f(j \mid \eta) = {J \choose j} \eta^j (1 - \eta)^{J - j}$.
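Here is a quick sanity check of that second construction (just a sketch; the $\mathrm{Beta}(2, 2)$ hyperparameters and the support $\{0, 1, \dots, J\}$ below are illustrative choices, not part of the question):

```python
import numpy as np
from math import comb

rng = np.random.default_rng(0)

def binomial_simplex_draw(J, alpha=2.0, beta=2.0):
    # Draw eta ~ Beta(alpha, beta) and set p_j proportional to the
    # Binomial(J, eta) pmf at j, for j = 0, ..., J (illustrative support).
    eta = rng.beta(alpha, beta)
    p = np.array([comb(J, j) * eta**j * (1 - eta)**(J - j) for j in range(J + 1)])
    return p / p.sum()  # already sums to 1; renormalise against round-off

# Empirical check that adjacent components co-move.
draws = np.array([binomial_simplex_draw(J=4) for _ in range(5000)])
print(np.corrcoef(draws[:, 1], draws[:, 2])[0, 1])  # clearly positive (around 0.2 with these settings)
```

The smoothness of the binomial pmf in $j$ is what links neighbouring components: a draw of $\eta$ that raises $p_j$ also tends to raise $p_{j \pm 1}$.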

At any rate, I'd like whatever I end up with to be as tractable as possible. The mixture of Dirichlets is appealing because I could get some nice conditional conjugacy working for me, but it's not clear how to set things up. This question talks about the logistic normal distribution, but I don't know much about it; is it tractable for Bayesian inference?

Of course, the components of a Dirichlet are already negatively correlated, and asking for "positive correlation" probably isn't totally coherent: if $p_i$ is large then it is, by nature, taking up most of the mass and hence forcing its neighbours to be small. Perhaps what I really mean is that $p_i$ should be positively correlated with $p_{i + 1} / \sum_{j \ne i} p_j$. Hopefully the question as stated is enough for people to know what I want and be able to help me.

guy

1 Answer


One way to have a random $\theta=(\theta_1,\dots,\theta_k)$ living on the simplex, without the limitations imposed by the negative covariances of the Dirichlet distribution, is to define $\phi_i=\sum_{j=1}^k c_{ij} \log \theta_j$, for $i=1,\dots,k-1$, where the $(k-1)\times k$ matrix $C=(c_{ij})$ has rank $k-1$. Given the constraint $\sum_{i=1}^k\theta_i=1$, any $(k-1)$-dimensional normal distribution may be assigned to $\phi=(\phi_1,\dots,\phi_{k-1})$.
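For concreteness, here is a minimal sketch of the simplest version of this construction, taking $C$ so that $\phi_i = \log(\theta_i/\theta_k)$ (the additive log-ratio transform); the covariance matrix below is an illustrative choice, not from the answer, meant to make $\theta_1$ and $\theta_2$ positively correlated:

```python
import numpy as np

rng = np.random.default_rng(0)

def logistic_normal_draws(mu, Sigma, n):
    # Sample phi ~ N_{k-1}(mu, Sigma) and invert phi_i = log(theta_i / theta_k):
    # theta_i = exp(phi_i) / (1 + sum_j exp(phi_j)), with theta_k as the reference.
    phi = rng.multivariate_normal(mu, Sigma, size=n)        # n x (k-1)
    w = np.exp(np.hstack([phi, np.zeros((n, 1))]))          # append phi_k = 0
    return w / w.sum(axis=1, keepdims=True)                 # rows live on the simplex

k = 4
mu = np.zeros(k - 1)
Sigma = np.array([[1.0, 0.9, 0.0],     # phi_1 and phi_2 strongly correlated,
                  [0.9, 1.0, 0.0],     # phi_3 independent (illustrative choice)
                  [0.0, 0.0, 1.0]])

theta = logistic_normal_draws(mu, Sigma, n=20000)
print(np.round(np.corrcoef(theta.T), 2))  # corr(theta_1, theta_2) comes out clearly positive
```

Because the covariance matrix of $\phi$ is unrestricted, the sign and strength of the dependence between chosen components of $\theta$ can be tuned; an ordinal, neighbour-linking structure for $\phi$ (something AR(1)-like, say) is one natural choice, though the sum-to-one constraint still imposes some negative dependence overall.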

Bayesian inference is tractable within this rich class of distributions, introduced and studied by Aitchison in a series of papers:

Journal of the Royal Statistical Society, $\textbf{B}$, $\textbf{44}$, 139-177 (1982),

Journal of the Royal Statistical Society, $\textbf{B}$, $\textbf{47}$, 136-146 (1985);

and in his book

$\textit{The Statistical Analysis of Compositional Data}$. Chapman & Hall: London (1986).

Zen