
According to Wikipedia's list of properties of the KL divergence, KL can never be negative. But for texts where the probabilities are very small, I somehow get negative values. For example:

Collection A: word count: 321, doc count: 65888, probA: 0.004871904

Collection B: word count: 1244, doc count: 120344, probB: 0.010337034

KL = $0.004871904 \cdot \ln\frac{0.004871904}{0.010337034} = -0.003664881$


1 Answer


The KL divergence is the sum of $q(i)\ln\frac{q(i)}{p(i)}$ over all values of $i$; your equation contains only a single term. For example, if your model were binomial (only two possible words occur in your documents) and $\Pr(\text{word}_1)$ were 0.005 in document 1 and 0.01 in document 2, then you would have:

\begin{equation} KL = 0.005 \cdot \ln\frac{0.005}{0.01} + 0.995 \cdot \ln\frac{0.995}{0.99} = 0.001547 \geq 0. \end{equation}

This sum (or integral, in the case of continuous random variables) is always non-negative, by Gibbs' inequality (see http://en.wikipedia.org/wiki/Gibbs%27_inequality); it equals zero only when the two distributions are identical.
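If it helps, here is a minimal Python sketch (not part of the original answer; the function name `kl_divergence` is just illustrative) that reproduces the binomial example and shows why dropping the second term gives a negative number:

```python
import numpy as np

def kl_divergence(q, p):
    """KL(q || p) = sum_i q(i) * ln(q(i) / p(i)), summed over the full support."""
    q = np.asarray(q, dtype=float)
    p = np.asarray(p, dtype=float)
    return np.sum(q * np.log(q / p))

# Binomial example: two "words" with probabilities 0.005 / 0.995
# in document 1 and 0.01 / 0.99 in document 2.
q = [0.005, 0.995]
p = [0.01, 0.99]
print(kl_divergence(q, p))            # ~0.001547, non-negative

# Keeping only the first term (as in the question) gives a negative value,
# but that single term is not a KL divergence.
print(0.005 * np.log(0.005 / 0.01))   # ~ -0.003466
```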
