I have three discrete probability distributions, A, B and C. They all measure P(X) under different circumstances. I suspect that A is more similar to B than it is to C. I know that I can quantify the difference between two distributions with the KL divergence, but how can I test whether the difference between A and B is smaller than the difference between A and C?

kjetil b halvorsen

Jig-nificant
- You have a tough problem, because statistical and mathematical theory will not decide the answer: you are assuming there is a relevant way to compare distributions so that "less than" has meaning. *What* meaning it might have is up to you to decide: that's not something we can tell you--although we can provide some guidance, if you would explain how you intend to interpret the result. – whuber Jun 12 '18 at 12:59
- You seem to answer your own question: by comparing the two KL-divergences. Of course that means you're committing to defining "difference between" as "KL-divergence from". – Mees de Vries Jun 12 '18 at 13:01
- Rather than comparing the distributions as a whole, can you not compare specific aspects of the distributions, captured by relevant quantiles or functions of quantiles? This might give you more insight into where the distributions differ (e.g., tails). – Isabella Ghement Jun 12 '18 at 13:09
1 Answer
You already got some hints in the comments, along with a request for more information, which you have not provided. Here are some thoughts:
Observations on three discrete variables, presumably defined on the same categories, can be represented as a contingency table, with one row per distribution and one column per category. You can then run a correspondence analysis; see Interpreting 2D correspondence analysis plots. The results can be presented graphically, and the three distributions can be compared using the so-called chi-square distance between their row profiles. A similar graphical analysis can probably be based on the KL divergence. To say more, we need more context.
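As a rough illustration of the two distances mentioned above, here is a minimal sketch in Python. The count table is invented purely for illustration, and the function names `kl_divergence` and `chi_square_distance` are hypothetical helpers, not part of any library. The chi-square distance is the one used in correspondence analysis (squared differences between row profiles, weighted by the inverse column masses). The KL helper assumes the second distribution is strictly positive wherever the first one is.

```python
import numpy as np

def kl_divergence(p, q):
    """KL(p || q) for discrete distributions on the same categories.

    Assumes q > 0 wherever p > 0; terms with p == 0 contribute zero.
    """
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

def chi_square_distance(counts, i, j):
    """Chi-square distance between row profiles i and j of a contingency table."""
    counts = np.asarray(counts, dtype=float)
    profiles = counts / counts.sum(axis=1, keepdims=True)  # each row sums to 1
    col_mass = counts.sum(axis=0) / counts.sum()           # average column profile
    diff = profiles[i] - profiles[j]
    return float(np.sqrt(np.sum(diff ** 2 / col_mass)))

# Hypothetical counts (rows: A, B, C; columns: categories of X).
counts = np.array([
    [30, 50, 20],   # A
    [28, 48, 24],   # B
    [10, 40, 50],   # C
])
profiles = counts / counts.sum(axis=1, keepdims=True)

d_ab = chi_square_distance(counts, 0, 1)
d_ac = chi_square_distance(counts, 0, 2)
kl_ab = kl_divergence(profiles[0], profiles[1])
kl_ac = kl_divergence(profiles[0], profiles[2])

print(f"chi-square distance A-B: {d_ab:.4f}, A-C: {d_ac:.4f}")
print(f"KL(A || B): {kl_ab:.4f}, KL(A || C): {kl_ac:.4f}")
```

These are point estimates only. They show which pair is closer under each definition of "difference" for this particular table, but they do not by themselves say whether the gap is larger than sampling noise.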

kjetil b halvorsen