A divergence is a function that takes two probability distributions as input and returns a number measuring how much they differ. This number must be non-negative, and equal to zero if and only if the two distributions are identical. Bigger numbers indicate greater dissimilarity.
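In symbols: for any two distributions $p$ and $q$, a divergence $D$ satisfies $D(p, q) \ge 0$, with $D(p, q) = 0$ if and only if $p = q$.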
Informally, people sometimes describe divergences as measuring the "distance" between probability distributions. This risks confusion with formal distance metrics, which must satisfy additional requirements. Beyond the conditions above, a distance metric must be symmetric: $D(a,b) = D(b,a)$. It must also satisfy the triangle inequality: $D(a,c) \le D(a,b) + D(b,c)$. As a side note, divergences are defined specifically on probability distributions, whereas distance metrics can be defined on other types of objects too.
All distance metrics between probability distributions are also divergences, but the converse is not true: a divergence may or may not be a distance metric. For example, the KL divergence is a divergence, but not a distance metric, because it's not symmetric and doesn't obey the triangle inequality. In contrast, the Hellinger distance is both a divergence and a distance metric. To avoid confusion with formal distance metrics, I prefer to say that divergences measure the dissimilarity between distributions.
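To make the asymmetry concrete, here's a minimal sketch in Python for discrete distributions (the helper names `kl_divergence` and `hellinger_distance` and the example distributions are mine, not from any particular library):

```python
import numpy as np

def kl_divergence(p, q):
    # Discrete KL divergence D_KL(p || q).
    # Assumes q > 0 wherever p > 0; terms with p = 0 contribute nothing.
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    mask = p > 0
    return np.sum(p[mask] * np.log(p[mask] / q[mask]))

def hellinger_distance(p, q):
    # Hellinger distance: Euclidean distance between sqrt(p) and sqrt(q),
    # scaled by 1/sqrt(2) so it lies in [0, 1].
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    return np.sqrt(0.5 * np.sum((np.sqrt(p) - np.sqrt(q)) ** 2))

p = np.array([0.6, 0.3, 0.1])
q = np.array([0.2, 0.5, 0.3])

print(kl_divergence(p, q), kl_divergence(q, p))          # two different numbers: KL is not symmetric
print(hellinger_distance(p, q), hellinger_distance(q, p))  # identical: Hellinger is symmetric
```

Both functions return zero when `p` and `q` are identical and a positive number otherwise, so both are divergences; only the Hellinger distance additionally behaves like a distance metric.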