
Say I observe $N$ observations $\{x_1, \dots, x_N\}$ from a $k$-component Gaussian mixture model, with $k > 0$ known, such that each $x_i \mid \boldsymbol{\pi}, \boldsymbol{\mu} \sim \sum_{j=1}^{k} \pi_j \mathcal{N}(\mu_j, \sigma_j^2)$, where each $\sigma_j^2$ is also known, $j = 1, \dots, k$. The vector of mixing weights $\boldsymbol{\pi} = (\pi_1, \dots, \pi_k)$ and the vector of means $\boldsymbol{\mu} = (\mu_1, \dots, \mu_k)$ are unknown.

Suppose also that the group label of each observation is unknown; i.e. let $z_i \in \{1, \dots, k\}$ be an allocation label assigning observation $i$ to one of the $k$ groups, $i = 1, \dots, N$. Marginally, we have $\mathbb{P}(z_i = j) = \pi_j$ for $j = 1, \dots, k$. However, I have another unknown parameter $\gamma > 0$ which is related only to the number of counts observed from each component; i.e. I know the probability distribution $\mathbb{P}(s_j \mid \gamma)$, where $s_j = \#(l : z_l = j)$, for each $j$.

I can construct a Gibbs sampler to sample $\boldsymbol{\pi}$ and $\boldsymbol{\mu}$ from their full conditionals when their prior distributions are Dirichlet and Gaussian respectively. However, I am stuck on finding the conditional distributions of $z_1, \dots, z_N$ and $\gamma$ given all other unknown parameters.
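For concreteness, the conjugate updates for $\boldsymbol{\pi}$ and $\boldsymbol{\mu}$ I have in mind can be sketched as below. This is a minimal sketch assuming a symmetric Dirichlet($\alpha$) prior on $\boldsymbol{\pi}$ and independent $\mathcal{N}(m_0, v_0)$ priors on each $\mu_j$; the function name and hyperparameter defaults are illustrative, not part of the model above.

```python
import numpy as np

rng = np.random.default_rng(0)

def gibbs_step_pi_mu(x, z, k, sigma, alpha=1.0, m0=0.0, v0=10.0):
    """One Gibbs update of (pi, mu) given allocations z.

    Assumes a symmetric Dirichlet(alpha) prior on pi and independent
    N(m0, v0) priors on each mu_j; sigma[j] is the known standard
    deviation of component j. (Hypothetical helper for illustration.)
    """
    counts = np.bincount(z, minlength=k)      # s_j = #{l : z_l = j}
    pi = rng.dirichlet(alpha + counts)        # Dirichlet posterior on pi

    mu = np.empty(k)
    for j in range(k):
        xj = x[z == j]
        # Conjugate normal update: precision-weighted combination of the
        # prior mean m0 and the data assigned to component j.
        prec = 1.0 / v0 + counts[j] / sigma[j] ** 2
        mean = (m0 / v0 + xj.sum() / sigma[j] ** 2) / prec
        mu[j] = rng.normal(mean, np.sqrt(1.0 / prec))
    return pi, mu
```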

Is it true that \begin{align*} f(\gamma \mid z_1, \dots, z_N) & \propto \mathbb{P}(s_1, \dots, s_k \mid \gamma) f(\gamma) \\ & \propto f(\gamma) \prod_j \mathbb{P}(s_j \mid \gamma), \end{align*} where $f(\gamma)$ denotes the prior distribution of $\gamma$?

And if so, does this mean that to sample $z_1, \dots, z_N$, for each $j$, $$ \mathbb{P}(z_i = j \mid \gamma, z_1, \dots, z_{i-1}, z_{i+1}, \dots, z_N, \boldsymbol{\mu}, \boldsymbol{\pi}) \propto \pi_j \exp\left(-\frac{1}{2\sigma_j^2} (x_i-\mu_j)^2 \right) \frac{\mathbb{P}(s_j = d+1 \mid \gamma)}{\int_{0}^{\infty} \mathbb{P}(s_j = d+1 \mid \gamma) f(\gamma) \, d\gamma}, $$ where $d = \#(l \neq i : z_l = j)$?

Or should I further condition on the counts $s_j$, $j = 1, \dots, k$? Any help would be greatly appreciated! Thanks.

user202654

  • $f(\gamma \mid z_1, \dots, z_N) \propto \mathbb{P}(s_1, \dots, s_k \mid \gamma) f(\gamma)$ looks ok, but $\propto f(\gamma) \prod_j \mathbb{P}(s_j \mid \gamma)$ implies that the $s_j$ are independent. I don't think they can be independent: when you increase one, the others must decrease (to continue to add up to $N$). – papgeo Dec 23 '18 at 11:53
  • Ah, you are correct, thanks. Do you have any idea how I would derive this then? Do I need to multiply by a factor ${N \choose s_1, \dots, s_k}$? – user202654 Dec 23 '18 at 13:46
  • Yes, the joint distribution of the $s_j$ is multinomial. – papgeo Dec 24 '18 at 03:04
  • Thanks, although I'm not sure how specifying the count distribution of $s_1, \dots, s_k$ would be involved in this? – user202654 Dec 24 '18 at 16:55

1 Answer


It is not possible to set the distribution of the $Z_i$'s on the one hand and of the $S_j$'s on the other hand as if they were unrelated. Setting a distribution on the $Z_i$'s implies that $(S_1,\ldots,S_k)$ has a multinomial distribution $$\mathcal{M}_k(N;\pi_1,\ldots,\pi_k).$$
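As a quick numerical illustration of this point (a simulation sketch, not part of the original answer): drawing each $z_i$ independently from a categorical distribution with weights $\boldsymbol{\pi}$ makes the counts $(s_1,\ldots,s_k)$ jointly multinomial, so they always sum to $N$ and are negatively correlated rather than independent.

```python
import numpy as np

# Simulation check: allocations z_i drawn i.i.d. from Categorical(pi)
# yield counts (s_1, ..., s_k) that are jointly Multinomial(N; pi),
# so E[s_j] = N * pi_j and the counts must sum to N in every replicate.
rng = np.random.default_rng(1)
N, pi = 100, np.array([0.2, 0.3, 0.5])

z = rng.choice(len(pi), size=(20000, N), p=pi)  # 20000 replicate draws
counts = np.stack([np.bincount(row, minlength=len(pi)) for row in z])

print(counts.mean(axis=0))   # approximately N * pi = [20, 30, 50]
# Counts are negatively correlated: increasing one forces others down.
print(np.corrcoef(counts[:, 0], counts[:, 1])[0, 1] < 0)  # True
```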

Xi'an
  • Ok thanks for the comment, I do understand. What if I specify that each $\pi_j = s_j/N$ and sample from the $s_j$s? – user202654 Dec 26 '18 at 00:59