
According to a paper I am reading, the Jeffries-Matusita distance is commonly used, but I couldn't find much information on it beyond the formula below:

JMD(x,y)=$\sqrt{\sum_i(\sqrt{x_i}-\sqrt{y_i})^2}$

It is similar to the Euclidean distance, except that the square roots of the components are compared rather than the components themselves:

E(x,y)=$\sqrt{\sum_i(x_i-y_i)^2}$

The JM distance is claimed to be more reliable than the Euclidean distance for classification. Can anyone explain why this difference makes the JM distance better?
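For concreteness, here is a minimal NumPy sketch of how I am computing the two quantities (the vectors `x` and `y` are just made-up non-negative examples):

```python
import numpy as np

def jm_distance(x, y):
    """Jeffries-Matusita distance as given by the formula above."""
    return np.sqrt(np.sum((np.sqrt(x) - np.sqrt(y)) ** 2))

def euclidean_distance(x, y):
    """Ordinary Euclidean distance, for comparison."""
    return np.sqrt(np.sum((x - y) ** 2))

# Made-up non-negative example vectors
x = np.array([0.1, 0.4, 0.5])
y = np.array([0.3, 0.3, 0.4])

print(jm_distance(x, y))         # ~0.26
print(euclidean_distance(x, y))  # ~0.24
```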

romy_ngo
  • I cannot find an authoritative reference that uses this formula for the Jeffries-Matusita distance. The formulas I do find are based on covariance matrices for two classes and appear to have no relationship to the one given here, but it seems that there may be two (or more) different things known by this name. Could you supply a reference or (even better) a link? BTW, are $x_i$ and $y_i$ *counts* by any chance? (If so, there is a natural interpretation of your formula.) – whuber Jun 10 '14 at 13:51
  • @whuber: maybe the $x$ and $y$ are stand-ins for [$p(x)$ and $q(x)$](http://phdfb1.free.fr/phdthesis/node70.html) – user603 Jun 10 '14 at 16:15
  • @user603 Yes, I think you've got it. Now the connections to KL divergences and the Bhattacharyya measure become apparent. – whuber Jun 10 '14 at 17:25

1 Answer


Some key differences, preceding a longer explanation below, are that:

  1. Crucially: the Jeffries-Matusita distance applies to distributions, rather than vectors in general.
  2. The J-M distance formula you quote above only applies to vectors representing discrete probability distributions (i.e. vectors that sum to 1).
  3. Unlike the Euclidean distance, the J-M distance can be generalised to any distributions for which the Bhattacharyya distance can be formulated.
  4. The J-M distance has, via the Bhattacharyya distance, a probabilistic interpretation.

The Jeffries-Matusita distance, which seems to be particularly popular in the Remote Sensing literature, is a transformation of the Bhattacharyya distance (a popular measure of the dissimilarity between two distributions, denoted here as $b(p,q)$) from the range $[0, \infty)$ to the fixed range $[0, \sqrt{2}]$:

$$ JM_{p,q}=\sqrt{2\left(1-\exp(-b(p,q))\right)} $$

A practical advantage of the J-M distance, according to this paper, is that this measure "tends to suppress high separability values, whilst overemphasising low separability values".
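A quick numerical illustration of that compression (a minimal sketch; the values in `b` are just arbitrary example Bhattacharyya distances):

```python
import numpy as np

# Arbitrary example Bhattacharyya distances, from identical to very separable
b = np.array([0.0, 0.5, 1.0, 2.0, 5.0, 50.0])

# J-M transformation: squashes [0, inf) into [0, sqrt(2)]
jm = np.sqrt(2 * (1 - np.exp(-b)))

print(jm)           # [0.    0.887 1.124 1.315 1.409 1.414] (rounded)
print(np.sqrt(2))   # 1.414..., the saturation value
```

Even very separable classes end up close to $\sqrt{2}$, while distributions that are near each other are spread over most of the range.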

The Bhattacharyya distance measures the dissimilarity of two distributions $p$ and $q$ in the following abstract continuous sense:
$$ b(p,q)=-\ln\int{\sqrt{p(x)q(x)}}\,dx $$
If the distributions $p$ and $q$ are captured by histograms represented by vectors that sum to one (where the $i$th element is the normalised count for the $i$th of $N$ bins), this becomes:
$$ b(p,q)=-\ln\sum_{i=1}^{N}\sqrt{p_i\cdot q_i} $$
and consequently the J-M distance for the two histograms is:
$$ JM_{p,q}=\sqrt{2\left(1-\sum_{i=1}^{N}{\sqrt{p_i\cdot q_i}}\right)} $$
which, noting that for normalised histograms $\sum_{i}{p_i}=\sum_{i}{q_i}=1$, is the same as the formula you gave above:
$$ JM_{p,q}=\sqrt{\sum_{i=1}^{N}{\left(\sqrt{p_i} - \sqrt{q_i}\right)^2}}=\sqrt{\sum_{i=1}^{N}{\left(p_i -2 \sqrt{p_i}\sqrt{q_i} + q_i \right)}}=\sqrt{2\left(1-\sum_{i=1}^{N}{\sqrt{p_i\cdot q_i}}\right)} $$
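A short numerical check of this equivalence (a minimal sketch; `p` and `q` are just arbitrary normalised histograms):

```python
import numpy as np

def bhattacharyya(p, q):
    """Discrete Bhattacharyya distance between two normalised histograms."""
    return -np.log(np.sum(np.sqrt(p * q)))

def jm_from_bhattacharyya(p, q):
    """J-M distance via the transformation of the Bhattacharyya distance."""
    return np.sqrt(2 * (1 - np.exp(-bhattacharyya(p, q))))

def jm_direct(p, q):
    """J-M distance via the square-root formula from the question."""
    return np.sqrt(np.sum((np.sqrt(p) - np.sqrt(q)) ** 2))

# Two arbitrary histograms (non-negative, summing to 1)
p = np.array([0.1, 0.2, 0.3, 0.4])
q = np.array([0.25, 0.25, 0.25, 0.25])

print(jm_from_bhattacharyya(p, q))  # identical up to floating-point error
print(jm_direct(p, q))
```

Note that the equivalence relies on both vectors summing to one; for unnormalised vectors the two expressions will generally differ.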

rroowwllaanndd
  • +1 Many thanks for jumping in and making this very well done effort to clarify the situation. – whuber Jun 11 '14 at 15:07