
I have two point estimates in space - each consists of a 3D position and a cigar-shaped 3x3 covariance matrix - and I am checking the hypothesis that these two observations actually refer to one and the same point. So I would like to quantify how well the two observations agree with this assumption.

A search brings up the Bhattacharyya distance and the Kullback–Leibler divergence as candidates. I am not looking for the most theoretically correct measure, but rather an easy-to-implement function that takes two positions and two 3x3 matrices and returns a percentage or a distance in standard deviations.

Here are some similar threads:

Mahalanobis distance between two bivariate distributions with different covariances

Measures of similarity or distance between two covariance matrices

  • Have you considered the Euclidean distance? If so, why is it inadequate? – Olivier Mar 27 '17 at 14:19
  • What's wrong with Bhattacharyya distance? It seems appropriate. – amoeba Mar 29 '17 at 14:45
  • @Olivier: The Euclidean distance wouldn't take the covariances into account at all. – bgp2000 Mar 29 '17 at 15:27
  • @amoeba: Then the question becomes how to implement the Bhattacharyya distance: http://stats.stackexchange.com/questions/81889/how-to-get-bhattacharyya-distance-in-excel-or-matlab-or-r I tried to port the R code referenced in that post: `bhattacharyya.dist` – bgp2000 Mar 29 '17 at 15:30
  • The formula for multivariate normal distributions is written out here https://en.wikipedia.org/wiki/Bhattacharyya_distance (and reproduced below for reference). Are you asking why it has the 1/8 factor? No idea, but that's definitely a separate question. Also, it doesn't really matter what the scalar factor there is. – amoeba Mar 29 '17 at 15:33
  • I expected `bhattacharyya(mu1, mu2, S, S) == mahalanobis(mu1, mu2, S)` to be true, but I guess no one promised me that. It would help if the two measures were comparable, though. – bgp2000 Mar 29 '17 at 15:39
  • For the record, Fisher-Rao http://www.scholarpedia.org/article/Fisher-Rao_metric distance between two 3-variate normal distributions is an appropriate distance between two covariances or two $\mathcal{N}$(mean(or position), covariance). – mic Mar 29 '17 at 16:01
  • @bgp2000 Do you know about the euclidean distance between matrices? :^) Fancy people call it the Frobenius distance. I'm pretty sure it takes into account the matrices you give it ;) – Olivier Mar 29 '17 at 19:20
  • @Olivier OK, but how would the Frobenius norm work exactly? I have two Vec3's and two Mat3x3's. – bgp2000 Mar 29 '17 at 19:29
  • You can put everything in a long vector and use any distance you want. You can also sum the distance between the points with the distance between the matrix. The possibilities are endless. That's why it's hard to suggest something better than the euclidean distance without some more context... – Olivier Mar 29 '17 at 19:58
  • @bgp2000, Hmm, Wikipedia says that "the Mahalanobis distance is a particular case of the Bhattacharyya distance when the standard deviations of the two classes are the same". Are you sure it's not the case? – amoeba Apr 01 '17 at 19:45
  • So, I guess someone did promise me that ;) Yeah, that seems odd to me too. – bgp2000 Apr 01 '17 at 19:53
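
For reference, the multivariate-normal form of the Bhattacharyya distance discussed in the comments above (as written out on the linked Wikipedia page) is

$$D_B = \frac{1}{8}(\mu_1-\mu_2)^{\mathsf T}\,\Sigma^{-1}(\mu_1-\mu_2) + \frac{1}{2}\ln\frac{\det\Sigma}{\sqrt{\det\Sigma_1\,\det\Sigma_2}}, \qquad \Sigma=\frac{\Sigma_1+\Sigma_2}{2},$$

so for $\Sigma_1 = \Sigma_2$ the log term vanishes and $D_B$ is exactly one eighth of the squared Mahalanobis distance, which is the relationship the last two comments point to.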

1 Answer


In the end I went for the Bhattacharyya distance. I adapted the R code referenced in the comments above:

// In the following, Vec3 and Mat3 are C++ Eigen types.
#include <Eigen/Dense>
#include <cmath>

using Vec3 = Eigen::Vector3d;
using Mat3 = Eigen::Matrix3d;

/// Squared Mahalanobis distance of the difference vector `dist` under covariance `cov`.
/// See: https://en.wikipedia.org/wiki/Mahalanobis_distance
double mahalanobis(const Vec3& dist, const Mat3& cov)
{
    return (dist.transpose() * cov.inverse() * dist).eval()(0);
}

/// Bhattacharyya distance between two Gaussians whose means differ by `dist`
/// and whose covariances are `cov1` and `cov2`.
/// See: https://en.wikipedia.org/wiki/Bhattacharyya_distance
double bhattacharyya(const Vec3& dist, const Mat3& cov1, const Mat3& cov2)
{
    const Mat3 cov = (cov1 + cov2) / 2;            // pooled ("average") covariance
    const double d1 = mahalanobis(dist, cov) / 8;  // term penalising the separation of the means
    const double d2 = std::log(cov.determinant() / std::sqrt(cov1.determinant() * cov2.determinant())) / 2;  // term penalising the mismatch of the covariances
    return d1 + d2;
}
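
Not part of the original answer, but a minimal usage sketch of the above (the positions and covariances are made-up numbers, and the "standard deviations" figure is simply the square root of the squared Mahalanobis distance of the difference under the pooled covariance):

#include <iostream>

int main()
{
    // Two hypothetical observations: positions plus cigar-shaped covariances.
    const Vec3 p1(1.0, 2.0, 3.0);
    const Vec3 p2(1.2, 2.1, 2.9);
    Mat3 cov1 = Mat3::Identity() * 0.01;
    cov1(2, 2) = 0.25;  // elongated along z
    Mat3 cov2 = Mat3::Identity() * 0.02;
    cov2(0, 0) = 0.16;  // elongated along x

    const double db = bhattacharyya(p1 - p2, cov1, cov2);
    const double sigmas = std::sqrt(mahalanobis(p1 - p2, (cov1 + cov2) / 2));

    std::cout << "Bhattacharyya distance: " << db << "\n"
              << "Separation in standard deviations (pooled): " << sigmas << "\n";
    return 0;
}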