
I recently asked for a way to calculate the BIC score for a given HMM (transition, emission, and initial distribution). After doing some more research (basically the wiki page and this CV thread), I realized it's an easily doable process. I didn't get any answers on my first thread, but hopefully this one is a more specific question.

So for a given HMM, the BIC is computed with the formula:

> BIC = -2*logLike + num_of_params * log(num_of_data)
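To make the formula concrete, here is a minimal Python sketch of it. The function name `bic` and its argument names are my own, and it assumes `log_likelihood` is the natural-log likelihood of the fitted model on the training data.

```python
import math


def bic(log_likelihood, num_params, num_data):
    """BIC = -2*logLike + num_of_params * log(num_of_data).

    log_likelihood: natural-log likelihood of the fitted model (assumed).
    num_params:     number of free parameters in the model.
    num_data:       number of observations used for fitting.
    """
    if num_data <= 0:
        raise ValueError("num_data must be positive")
    return -2.0 * log_likelihood + num_params * math.log(num_data)
```

With this sign convention, a lower score indicates a better trade-off between fit and complexity.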

The formula requires two things that I don't know how to get: the number of parameters and the number of data points. Based on the CV thread above, the two general formulas seem to be

> Nparams = size(model.trans,2)*(size(model.trans,2)-1) + 
>           (size(model.pi,2)-1) + 
>           size(model.emission.T,1)*(size(model.emission.T,2)-1)

and

> Nparams = Num_of_states*(Num_of_States-1) - Nbzeros_in_transition_matrix

Does the first formula work for every HMM with any nxn transition/emission matrix? And for the second formula, can I treat any transition probability less than $10^{-4}$ as effectively zero?
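For reference, this is how I read the two counting formulas, as a rough Python sketch. It assumes the matrices are plain nested lists with one row per state, that the emission matrix is a discrete (categorical) table whose rows each sum to 1, and that `pi` is a flat list. The function names and the `tol` parameter are mine, and the $10^{-4}$ default is exactly the cutoff I am asking about, not an established value.

```python
def count_params_full(trans, pi, emission):
    """First formula: every row of each stochastic matrix loses one
    degree of freedom because it must sum to 1."""
    k = len(trans[0])                # size(model.trans, 2)
    n_emit_rows = len(emission)      # size(model.emission.T, 1)
    n_emit_cols = len(emission[0])   # size(model.emission.T, 2)
    n_trans = k * (k - 1)
    n_init = len(pi) - 1             # size(model.pi, 2) - 1
    n_emit = n_emit_rows * (n_emit_cols - 1)
    return n_trans + n_init + n_emit


def count_transition_params_sparse(trans, tol=1e-4):
    """Second formula: transition parameters only, minus the entries
    treated as zero (anything below tol, which is an assumption)."""
    k = len(trans)
    n_zeros = sum(1 for row in trans for p in row if p < tol)
    return k * (k - 1) - n_zeros
```

Note that the second function only counts transition parameters, so the initial-distribution and emission terms from the first formula would still have to be added to get a total.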

Some context: I am using software called ChromHMM, which uses the Baum-Welch algorithm on a multivariate data set to fit an HMM for a given number of states. I ran it from 20 states to 40 states, giving me 20 models, but I don't know which one to pick.
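Once each model has a BIC score, the selection step itself would be simple. A small sketch, assuming the scores are collected in a dictionary keyed by number of states (the name `pick_lowest_bic` is hypothetical, and this is not something ChromHMM provides as far as I know):

```python
def pick_lowest_bic(bic_by_num_states):
    """Return the number of states whose model has the lowest BIC.

    bic_by_num_states: dict mapping number of states -> BIC score,
    with BIC defined as -2*logLike + num_of_params * log(num_of_data).
    """
    if not bic_by_num_states:
        raise ValueError("no models to compare")
    return min(bic_by_num_states, key=bic_by_num_states.get)
```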

masfenix
  • If I remember the details of Schwarz' paper right, that form of BIC derives from an asymptotic argument under a particular set of assumptions. The BIC has since been extended several times to kinds of models not covered by his initial derivation, but you can't simply assume the argument carries over to any model; it would need to be justified. It may have been justified for HMMs, but I haven't seen it done. [Of course, if you're not treating the formula for BIC you give above as relating to any Bayesian argument to identify $p$ but simply as some kind of penalized likelihood, this may not matter.] – Glen_b Sep 30 '14 at 00:07
  • I am not at all sure the original argument covers HMMs. You mention in other questions people using BIC for HMM in papers. Do they give some justification (or refer to any) for the form $-2\log\cal{L}+p\log(n)$ in the case of HMMs? – Glen_b Sep 30 '14 at 00:09
  • All the papers I'm reading that do research similar to mine (bioinformatics) often talk about using BIC to select the best model. Today I had a discussion with my supervisor, who also agreed that I should use BIC (simply because that's what the industry/current papers are using). I would love to hear an alternative (I did bring up AIC). – masfenix Sep 30 '14 at 00:25
  • I'm not trying to dissuade you from using it ... but I presume you'd have some interest in whether it was actually justified for HMMs (I'd presume so, but you can't always tell); I'd be cautious about giving advice on a procedure that's possibly relying on only a presumption that it works. [If it is as widely used as you suggest, one would assume someone has already described, or at least hinted, how they count parameters in HMMs.] – Glen_b Sep 30 '14 at 00:29
  • Current papers (this one, for example: http://www.ncbi.nlm.nih.gov/pubmed/23104890) talk about the BIC criterion on page 2 under HMM training. Even the supplemental part of that paper doesn't have any details. But I will ask my advisor for his opinion. – masfenix Sep 30 '14 at 00:35
