
I have multiple sets of discrete probability histograms (vectors) and I want to measure the distance between each pair of histograms. I have done some research, but I am still in doubt.

The literature suggests I could use the Bhattacharyya distance or the Hellinger distance (the two are closely related). Which one should I use?

Some additional information: probability histogram (vector) $A = (.18, .61, .16, .05)$ and probability histogram (vector) $B = (.26, .55, .16, .03)$. I want to calculate the distance between these two probability histograms. The Bhattacharyya coefficient ($BC$) is

$$ \sqrt{.18 \times .26}+\sqrt{.61 \times .55}+\sqrt{.16 \times .16}+\sqrt{.05 \times .03} \; . $$

Consequently, the Bhattacharyya distance is $-\ln(BC)$ and the Hellinger distance is $2 \sqrt{1-BC}$.
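As a rough check of the arithmetic above, here is a minimal Python sketch that evaluates $BC$ and both derived quantities for $A$ and $B$. The function name `bhattacharyya_coefficient` is made up for illustration. The normalization of the Hellinger distance varies between references, so the sketch reports both $\sqrt{1-BC}$ (which lies in $[0, 1]$) and the $2\sqrt{1-BC}$ form used in this question; they differ only by a constant factor.

```python
import math

def bhattacharyya_coefficient(p, q):
    """Sum of sqrt(p_i * q_i) over matching bins (hypothetical helper name)."""
    if len(p) != len(q):
        raise ValueError("histograms must have the same number of bins")
    return sum(math.sqrt(pi * qi) for pi, qi in zip(p, q))

A = [0.18, 0.61, 0.16, 0.05]
B = [0.26, 0.55, 0.16, 0.03]

bc = bhattacharyya_coefficient(A, B)

# Bhattacharyya distance as defined in the question.
bhattacharyya_distance = -math.log(bc)

# Hellinger distance under two normalizations; max() guards against a tiny
# negative value caused by floating-point rounding when p and q are identical.
hellinger_unit = math.sqrt(max(0.0, 1.0 - bc))   # sqrt(1 - BC), in [0, 1]
hellinger_question = 2.0 * hellinger_unit        # 2 * sqrt(1 - BC), as in the question

print(f"BC                     = {bc:.4f}")
print(f"Bhattacharyya distance = {bhattacharyya_distance:.4f}")
print(f"sqrt(1 - BC)           = {hellinger_unit:.4f}")
print(f"2 * sqrt(1 - BC)       = {hellinger_question:.4f}")
```

For these two histograms $BC$ comes out close to 1, so both distances are small, as expected for vectors that differ by only a few hundredths per bin.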

Is this the right measure and is this the right calculation?

kjetil b halvorsen
Jasper
  • Isn't the Hellinger distance $\sqrt{1-BC}$? That's what I have always used, and it seems to be confirmed in a few references I have here. I'm not sure where the factor of 2 comes from. – Mari153 Jan 07 '22 at 11:53

2 Answers


The Jensen–Shannon distance is the first thing I'd consider. If you don't insist on having a "distance function", you can directly use the Jensen–Shannon divergence, from which this distance is derived.

JS divergence is widely used to measure the difference between two probability distributions. It fits your case, as the inputs are two probability vectors. JS divergence is a straightforward modification of the well-known Kullback–Leibler divergence.

Generally, KL and JS divergence require that the input vectors have nonzero entries. When there are zeros in the input, many people simply choose to throw out those values. See https://mathoverflow.net/a/72672 for more details on this issue.
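To make the definition concrete, here is a minimal Python sketch of the JS divergence built from two KL terms against the mixture $M = (P + Q)/2$, with the JS distance taken as its square root. The function names are made up for illustration. It uses natural logarithms and assumes one common convention for zeros: terms with $p_i = 0$ are skipped, i.e. $0 \log 0$ is treated as $0$.

```python
import math

def kl_divergence(p, q):
    """KL(p || q) in nats; terms with p_i == 0 are skipped (0 * log 0 := 0)."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def js_divergence(p, q):
    """Jensen-Shannon divergence: average KL of p and q to their mixture m."""
    if len(p) != len(q):
        raise ValueError("histograms must have the same number of bins")
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    # m_i > 0 whenever p_i > 0 or q_i > 0, so the ratios inside
    # kl_divergence are always well defined here.
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)

def js_distance(p, q):
    """Square root of the JS divergence; max() guards against rounding below 0."""
    return math.sqrt(max(0.0, js_divergence(p, q)))

A = [0.18, 0.61, 0.16, 0.05]
B = [0.26, 0.55, 0.16, 0.03]

print("JS divergence:", js_divergence(A, B))
print("JS distance:  ", js_distance(A, B))
```

With natural logarithms the divergence is bounded above by $\ln 2$, which is reached when the two histograms have no bins in common.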

Weiwei

There is also the Wasserstein metric, which has become more popular in the Bayesian inference literature. In your case, the 'wasserstein1d' function from the R package 'transport' should do the trick:

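  library(transport)     # provides wasserstein1d()
  x <- rnorm(200)        # sample of 200 draws from N(0, 1)
  y <- rnorm(150, 2)     # sample of 150 draws from N(2, 1)
  wasserstein1d(x, y)    # 1-Wasserstein distance between the two samples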
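The snippet above compares two samples rather than two histograms. For probability vectors such as $A$ and $B$ from the question, a rough Python sketch is possible under two assumptions that are not stated in the question: the bins have a natural order, and they are equally spaced. In that case the 1-Wasserstein distance reduces to the area between the two cumulative distributions. The function name and the `bin_width` parameter are made up for illustration.

```python
from itertools import accumulate

def wasserstein1_histograms(p, q, bin_width=1.0):
    """1-Wasserstein distance between two histograms on the same ordered,
    equally spaced bins: bin_width * sum of |CDF_p - CDF_q| (assumed setup)."""
    if len(p) != len(q):
        raise ValueError("histograms must have the same number of bins")
    cdf_p = accumulate(p)  # running sums give the cumulative distribution
    cdf_q = accumulate(q)
    return bin_width * sum(abs(cp - cq) for cp, cq in zip(cdf_p, cdf_q))

A = [0.18, 0.61, 0.16, 0.05]
B = [0.26, 0.55, 0.16, 0.03]

print("Wasserstein-1 (unit-spaced bins):", wasserstein1_histograms(A, B))
```

Unlike the Bhattacharyya, Hellinger, and Jensen–Shannon measures, this one depends on how far apart the bins are, so it is only meaningful when the bin order carries information.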
Seb