Whenever I look up material pertaining to Gaussian Mixture Models, it mentions a latent variable $\mathbf{z}$ indicating which of the $K$ components generated an observation (often one-hot encoded, so $z_k \in \{0,1\}$ with $\sum_k z_k = 1$). I completely understand the objective of GMMs and how the model is a linear superposition of Gaussians of the form $p(\textbf{x}) = \sum\limits_{k=1}^K \pi_k\,\mathcal{N}(\textbf{x}\vert\mu_k,\Sigma_k)$, but I don't understand what purpose the latent variable serves, or why the joint distribution needs to be defined in terms of the marginal $p(\textbf{z})$ and the conditional $p(\textbf{x}\vert\textbf{z})$. Is there an intuitive reason for the variable $\textbf{z}$?
The latent variable $\mathbf{z}$ is the allocation vector that attaches to each observation $x_i$ in the sample its component indicator $z_i$, i.e.,
$$\mathbb{P}(Z_i=k)=\pi_k \qquad\text{and}\qquad X_i\mid Z_i=k \sim \mathcal{N}(\mu_k,\Sigma_k).$$
Marginalizing out $z_i$ recovers exactly the mixture density in the question,
$$p(x_i)=\sum_{k=1}^K \mathbb{P}(Z_i=k)\,p(x_i\mid Z_i=k)=\sum_{k=1}^K \pi_k\,\mathcal{N}(x_i\mid\mu_k,\Sigma_k),$$
while conditioning on $z_i$ reduces an awkward sum of Gaussians to a single Gaussian. This is what makes the latent variable useful in the EM algorithm and the Gibbs sampler: both alternate between (probabilistically) completing the $z_i$'s and updating the parameters given those completions. It is also a key notion behind clustering and classification, since $\mathbb{P}(Z_i=k\mid x_i)$ is the posterior probability that observation $i$ belongs to component $k$.
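To make the generative reading of $\mathbf{z}$ concrete, here is a minimal sketch in Python (with NumPy/SciPy, and made-up parameter values): ancestral sampling draws $z_i$ first and then $x_i \mid z_i$, and the E-step "responsibilities" are just the posterior probabilities $\mathbb{P}(Z_i=k\mid x_i)$.

```python
import numpy as np
from scipy.stats import multivariate_normal

rng = np.random.default_rng(0)

# Hypothetical 2-component GMM in 2 dimensions (illustrative values only)
pi = np.array([0.3, 0.7])                              # mixture weights P(Z = k)
mus = [np.array([0.0, 0.0]), np.array([3.0, 3.0])]     # component means
Sigmas = [np.eye(2), 0.5 * np.eye(2)]                  # component covariances

# Ancestral sampling: draw the latent allocation z_i, then x_i | z_i
n = 500
z = rng.choice(len(pi), size=n, p=pi)
x = np.stack([rng.multivariate_normal(mus[k], Sigmas[k]) for k in z])

# E-step responsibilities: P(Z_i = k | x_i) ∝ pi_k N(x_i | mu_k, Sigma_k)
dens = np.stack([pi[k] * multivariate_normal(mus[k], Sigmas[k]).pdf(x)
                 for k in range(len(pi))], axis=1)
resp = dens / dens.sum(axis=1, keepdims=True)
```

Summing `dens` over $k$ gives the mixture density $p(x_i)$, i.e. the marginal in the question, and `resp.argmax(axis=1)` gives the most probable cluster for each point.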

Xi'an