Given: Two continuous multivariate probability distributions, expressed as mixture models (possibly, but not necessarily, Gaussian Mixture Models)
Desired Output: The Hellinger distance between the two probability distributions. The distributions will then be used to generate data streams.
Constraints: I would like to code this in Java, for implementation with the MOA framework. With Java it is possible to import the Apache Commons Math classes, but if need be, I can implement this in another language and save the output as ARFF files.
What I know so far: The Hellinger distance quantifies how similar two probability distributions are. From Wikipedia, I know that the definition in terms of elementary probability theory starts from two probability density functions:
If we denote the densities as $f$ and $g$, respectively, the squared Hellinger distance can be expressed as a standard calculus integral: $H^2(f,g)=\frac{1}{2}\int(\sqrt{f(x)}-\sqrt{g(x)})^2dx$.
Of the definitions I have found, this is the one I find most interpretable, and it seems to lend itself both to constructing the mixture models and to computing the Hellinger distance programmatically.
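To make the integral concrete, here is a minimal sketch that evaluates it numerically in the simplest case I can think of: two one-dimensional Gaussian mixtures, integrated with the trapezoid rule on a fixed grid. The class name `HellingerSketch`, the method names, the parameter layout (parallel arrays of weights, means, and standard deviations), and the integration bounds are all my own assumptions for illustration, not part of MOA or Apache Commons Math. A grid like this will not scale to many dimensions, so it only illustrates the definition.

```java
public class HellingerSketch {

    /** Density of a 1-D Gaussian mixture at x; weights are assumed to sum to 1. */
    static double mixturePdf(double x, double[] w, double[] mu, double[] sd) {
        double p = 0.0;
        for (int k = 0; k < w.length; k++) {
            double z = (x - mu[k]) / sd[k];
            p += w[k] * Math.exp(-0.5 * z * z) / (sd[k] * Math.sqrt(2.0 * Math.PI));
        }
        return p;
    }

    /** Trapezoid-rule estimate of H^2 = 0.5 * integral of (sqrt f - sqrt g)^2 over [lo, hi]. */
    static double squaredHellinger(double[] wf, double[] muf, double[] sdf,
                                   double[] wg, double[] mug, double[] sdg,
                                   double lo, double hi, int steps) {
        double h = (hi - lo) / steps;
        double sum = 0.0;
        for (int i = 0; i <= steps; i++) {
            double x = lo + i * h;
            double d = Math.sqrt(mixturePdf(x, wf, muf, sdf))
                     - Math.sqrt(mixturePdf(x, wg, mug, sdg));
            // Endpoints get half weight in the trapezoid rule.
            double weight = (i == 0 || i == steps) ? 0.5 : 1.0;
            sum += weight * d * d;
        }
        return 0.5 * sum * h;
    }

    public static void main(String[] args) {
        // Hypothetical example mixtures: f has two components, g has one.
        double[] wf = {0.4, 0.6}, muf = {-1.0, 2.0}, sdf = {0.5, 1.0};
        double[] wg = {1.0},      mug = {0.5},       sdg = {1.5};
        // Bounds chosen wide enough that both densities are negligible outside them.
        double h2 = squaredHellinger(wf, muf, sdf, wg, mug, sdg, -15.0, 15.0, 20000);
        System.out.println("H^2 = " + h2 + ", H = " + Math.sqrt(h2));
    }
}
```

One way to check such a routine is against a case with a known answer: for two single Gaussians with a common standard deviation $\sigma$ and means $\mu_1$, $\mu_2$, the squared Hellinger distance under the definition above is $1-\exp\left(-\frac{(\mu_1-\mu_2)^2}{8\sigma^2}\right)$.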
Questions:
Is there a way to express this form of the Hellinger Distance that lends itself to easy (or at least natural) computation by a computer?
Is there another way of framing the calculation that would be more natural?
Can all of this be sidestepped and the Hellinger Distance approximated using sampling?