
I want to build a hidden Markov model (HMM) with continuous observations modeled as Gaussian mixtures (Gaussian mixture model = GMM).

The way I understand the training process, it should be done in $2$ steps.

1) Train the GMM parameters first using expectation-maximization (EM).

2) Train the HMM parameters using EM.

Is this training process correct or am I missing something?

notArefill
  • Been working on HMMs for some years now and to me the best tutorial ever to understand the training of Gaussian Mixture HMMs is here: http://web.stanford.edu/class/ee378b/papers/bilmes-em.pdf Equations explained step by step. ;) – Eskapp Nov 18 '16 at 14:34
  • Btw, https://github.com/hmmlearn/hmmlearn is a very nice (maybe the only one, actually) library that is simple enough to use, supports HMMs with GMM emissions, and has adequate documentation (see the usage sketch just after these comments). If for whatever reason you want to do the implementation yourself, you can dive into the files. – DimP Apr 26 '17 at 23:43
  • @Eskapp The linked tutorial is blocked by Stanford's login. Is there another way to read it? – aepound Sep 26 '17 at 15:05
  • @aepound Yes, here http://melodi.ee.washington.edu/people/bilmes/mypapers/em.pdf or http://lasa.epfl.ch/teaching/lectures/ML_Phd/Notes/GP-GMM.pdf (very long version) – Eskapp Sep 26 '17 at 15:11
  • @Eskapp It seems that both of these links are dead. Has this article been published anywhere? Can you give a full citation? Or a working link? – Sycorax Jan 07 '21 at 21:28
  • @Sycorax https://www.cs.cmu.edu/~aarti/Class/10701/readings/gentle_tut_HMM.pdf A Gentle Tutorial of the EM algorithm and its application to parameter estimation for Gaussian mixture and hidden Markov models by Jeff A. Bilmes. Also hosted here: https://imaging.mrc-cbu.cam.ac.uk/methods/BayesianStuff?action=AttachFile&do=get&target=bilmes-em-algorithm.pdf – Eskapp Jan 07 '21 at 22:08
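For concreteness, here is a minimal sketch of fitting a GMM-HMM with the hmmlearn library mentioned in the comments above. It assumes hmmlearn's `GMMHMM` class; the data `X`, the sequence lengths, and the model sizes are placeholders rather than anything from the question.

```python
# A minimal sketch, assuming hmmlearn's GMMHMM API; data and model sizes are placeholders.
import numpy as np
from hmmlearn.hmm import GMMHMM

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 2))       # 500 frames of 2-D observations (synthetic placeholder)
lengths = [250, 250]                # two training sequences of 250 frames each

model = GMMHMM(n_components=3,      # number of hidden states
               n_mix=2,             # Gaussian mixture components per state
               covariance_type="diag",
               n_iter=100)
model.fit(X, lengths)               # EM (Baum-Welch) over HMM and GMM parameters jointly

states = model.predict(X, lengths)  # most likely state sequence (Viterbi)
print(model.score(X, lengths))      # log-likelihood under the fitted model
```

Note that `fit` runs a single EM procedure that updates the transition probabilities and the per-state mixture parameters together, rather than fitting the GMMs first and the HMM afterwards; that is also the point made in the answers below.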

4 Answers


In the reference at the bottom $^*$, I see the training involves the following:

  1. Initialize the HMM & GMM parameters (randomly or using prior assumptions).

    Then repeat the following until convergence criteria are satisfied:

  2. Do a forward pass and a backward pass to compute the probabilities associated with the training sequences under the current GMM-HMM parameters.

  3. Re-estimate the HMM & GMM parameters - the means, covariances, and mixture coefficients of each mixture component at each state, and the transition probabilities between states - all calculated using the probabilities found in step 2 (a rough sketch of this loop follows the reference below).

$*$ University of Edinburgh GMM-HMM slides (Google: Hidden Markov Models and Gaussian Mixture Models, or try this link). This reference gives a lot of details and suggests doing these calculations in the log domain.
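To make this loop concrete, here is a rough NumPy sketch for a single observation sequence `X` of shape `(T, d)`. It is not taken from the slides: the variable names, the random initialization, and the use of probability-domain scaling instead of log-domain arithmetic are my own simplifications.

```python
import numpy as np
from scipy.stats import multivariate_normal

def gmm_hmm_em(X, n_states, n_mix, n_iter=20, seed=0):
    """Sketch of steps 1-3: initialize, then alternate a scaled forward-backward
    pass with re-estimation of the HMM and GMM parameters."""
    rng = np.random.default_rng(seed)
    T, d = X.shape
    # Step 1: initialization (random here; k-means initialization is common in practice).
    pi = np.full(n_states, 1.0 / n_states)                       # initial state probabilities
    A = rng.dirichlet(np.ones(n_states), size=n_states)          # transition matrix
    c = np.full((n_states, n_mix), 1.0 / n_mix)                  # mixture weights
    mu = X[rng.choice(T, size=(n_states, n_mix))].astype(float)  # component means
    Sigma = np.tile(np.cov(X.T).reshape(d, d) + 1e-3 * np.eye(d), (n_states, n_mix, 1, 1))

    for _ in range(n_iter):
        # Per-component terms c_{jm} N(o_t | mu_{jm}, Sigma_{jm}) and per-state b_j(o_t).
        comp = np.zeros((T, n_states, n_mix))
        for j in range(n_states):
            for m in range(n_mix):
                comp[:, j, m] = c[j, m] * multivariate_normal.pdf(X, mu[j, m], Sigma[j, m])
        B = comp.sum(axis=2)

        # Step 2: scaled forward and backward passes.
        alpha = np.zeros((T, n_states)); beta = np.zeros((T, n_states)); scale = np.zeros(T)
        alpha[0] = pi * B[0]; scale[0] = alpha[0].sum(); alpha[0] /= scale[0]
        for t in range(1, T):
            alpha[t] = (alpha[t - 1] @ A) * B[t]
            scale[t] = alpha[t].sum(); alpha[t] /= scale[t]
        beta[-1] = 1.0
        for t in range(T - 2, -1, -1):
            beta[t] = (A @ (B[t + 1] * beta[t + 1])) / scale[t + 1]

        # State and state-mixture occupation probabilities.
        gamma = alpha * beta
        gamma /= gamma.sum(axis=1, keepdims=True)
        gamma_jm = gamma[:, :, None] * comp / np.maximum(B[:, :, None], 1e-300)

        # Step 3: re-estimate transition, weight, mean, and covariance parameters.
        xi = alpha[:-1, :, None] * A[None] * (B[1:] * beta[1:])[:, None, :]
        xi /= xi.sum(axis=(1, 2), keepdims=True)
        pi = gamma[0]
        A = xi.sum(axis=0) / gamma[:-1].sum(axis=0)[:, None]
        c = gamma_jm.sum(axis=0) / gamma.sum(axis=0)[:, None]
        for j in range(n_states):
            for m in range(n_mix):
                w = gamma_jm[:, j, m]
                mu[j, m] = (w[:, None] * X).sum(axis=0) / w.sum()
                diff = X - mu[j, m]
                Sigma[j, m] = (w[:, None, None] * diff[:, :, None] * diff[:, None, :]).sum(axis=0) / w.sum()
                Sigma[j, m] += 1e-6 * np.eye(d)                  # regularize for numerical stability
    return pi, A, c, mu, Sigma
```

In practice you would initialize the means with k-means, accumulate statistics over many training sequences, and work in the log domain as the slides suggest.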

Alex

This paper [1] is an absolute classic and has the whole HMM machinery for Gaussian mixtures laid out for you. I think it's fair to say Rabiner made the first important step in speech recognition with GMMs in the 1980s.

[1] Rabiner, L. R. (1989). A tutorial on hidden Markov models and selected applications in speech recognition. Proceedings of the IEEE, 77(2), 257-286.

horaceT
  • I am aware of this paper, and I have read it. But what you have written is not an answer; you just point me to a paper and do not answer my question. Also, my question was not about Rabiner, so the sentence "I think it's fair to say Rabiner made the first important step in speech recognition with GMMs in the 1980s" is irrelevant to my question. – notArefill Sep 12 '16 at 16:23

pomegranate is another Python library that provides GMMs and HMMs, with even better documentation than hmmlearn. I am currently preparing to migrate from hmmlearn to it. http://pomegranate.readthedocs.io/en/latest/GeneralMixtureModel.html

  • I'm not seeing support for continuous observations in that package. Does it support continuous observations? I've found some packages that do Gaussian HMMs and some that do continuous-observation HMMs, but I can't find one that does both at once. – Danny May 01 '19 at 20:53

Assuming your HMM uses Gaussian mixture emissions, parameter estimation still consists of a forward pass and a backward pass followed by parameter updates. The difference is that the probability of an observation given a state is now a mixture of normal pdfs. The transition probabilities are re-estimated exactly as in a discrete-observation HMM, but to re-estimate the means, variances (or covariance matrices in the multivariate case), and mixture weights you need the probability of being in state $i$ at time $t$ with the $m$-th mixture component accounting for the observation at $t$:

$$\gamma_t(i, m) = \frac{\alpha_t(i)\,\beta_t(i)}{\sum_{j}\alpha_t(j)\,\beta_t(j)} \cdot \frac{c_{im}\,\mathcal{N}(o_t \mid \mu_{im}, \Sigma_{im})}{\sum_{k} c_{ik}\,\mathcal{N}(o_t \mid \mu_{ik}, \Sigma_{ik})},$$

i.e. the normalized $\alpha \cdot \beta$ times the normalized $c \cdot \mathcal{N}(o, \mu, \Sigma)$, where $\alpha$ and $\beta$ are the forward and backward variables from Baum-Welch, $c_{im}$ is the $m$-th mixture weight of state $i$, $o_t$ is the observation at time $t$, $\mu_{im}$ is the mean (or mean vector), and $\Sigma_{im}$ is the variance (or covariance matrix).

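To make the formula concrete, here is a small NumPy sketch of that quantity; the function name and array layout are hypothetical, not from the answer.

```python
import numpy as np

def mixture_occupancy(alpha, beta, c, comp):
    """gamma_t(i, m): probability of being in state i at time t with the m-th
    mixture component accounting for the observation o_t.
    alpha, beta : (T, N) forward/backward variables from Baum-Welch
    c           : (N, M) mixture weights per state
    comp        : (T, N, M) Gaussian densities N(o_t | mu_{im}, Sigma_{im})
    """
    gamma = alpha * beta
    gamma /= gamma.sum(axis=1, keepdims=True)          # normalized alpha * beta
    weighted = c[None, :, :] * comp                    # c_{im} * N(o_t | mu_{im}, Sigma_{im})
    return gamma[:, :, None] * weighted / weighted.sum(axis=2, keepdims=True)
```

These are exactly the responsibilities used to update the mixture weights, means, and covariances in the M-step.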
Kevvy Kim
  • Actually, the link given by answerer Alex has a pretty good description. – Kevvy Kim Dec 29 '16 at 06:10
  • Could you expand on how this adds to the previous answers? – mdewey Dec 29 '16 at 09:37
  • If you mean train by re-estimating the parameters, then page 18 of the slides Alex linked gives the equations. I could have given the equations here, but as I don't know how to typeset formulas, I would have a hard time explaining them in words. – Kevvy Kim Jan 02 '17 at 13:37