According to this Wikipedia article, one can represent the product of probabilities x⋅y as -log(x) - log(y), which makes the computation more efficient. But if I try an example, say:

p1 = 0.5
p2 = 0.5
p1 * p2 = 0.25
-log2(p1) - log2(p2) = 2

p3 = 0.1
p4 = 0.1
p3 * p4 = 0.01
-log2(p3) - log2(p4) = 6.64

The product of probabilities p1 and p2 is higher than that of p3 and p4, but its negative log probability is lower.

How come?

Glen_b
spacemonkey
  • What's wrong? Smaller probabilities _will_ give larger values because $-\log p$ increases from $0$ when $p=1$ towards $\infty$ as $p \to 0$. – Dilip Sarwate Oct 24 '14 at 01:48
  • (+1) Why downvote? I think this is a well-written on-topic question, albeit very elementary. – Juho Kokkala Oct 24 '14 at 08:48
  • @DilipSarwate my problem is not with the math part, but with this particular way of representing probabilities. Maybe it is just a matter of getting comfortable with it. – spacemonkey Oct 24 '14 at 14:08

1 Answer

I fear you have misunderstood what the article intends. This is no great surprise, since it's somewhat unclearly written. There are two different things going on.

The first is simply to work on the log scale.

That is, instead of "$p_{AB} = p_A\cdot p_B$" (when you have independence), one can instead write "$\log(p_{AB}) = \log(p_A)+ \log(p_B)$". If you need the actual probability, you can exponentiate at the end to get back $p_{AB}$: $\qquad p_{AB}=e^{\log(p_A)+ \log(p_B)}\,,$ but if needed at all, the exponentiation would normally be left to the last possible step. So far so good.
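As a minimal sketch of this first step (the function name `joint_probability_via_logs` is made up for illustration, and independence of the events is assumed, as above), one can sum natural logs and exponentiate only at the very end:

```python
import math

def joint_probability_via_logs(probabilities):
    """Multiply independent probabilities by summing their natural logs.

    Hypothetical helper for illustration; assumes every p is in (0, 1].
    """
    log_total = sum(math.log(p) for p in probabilities)  # stay on the log scale
    return math.exp(log_total)  # exponentiate only at the last step

p_ab = joint_probability_via_logs([0.5, 0.5])
print(p_ab)  # agrees with 0.5 * 0.5 up to floating-point rounding
```

In practice the exponentiation is often skipped altogether, since comparing log probabilities gives the same ordering as comparing the probabilities themselves.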

The second part is replacing $\log p$ with $-\log p$. This is so that we work with positive values.

Personally, I don't really see much value in this, especially since it reverses the direction of any ordering ($\log$ is monotonic increasing, so if $p_1<p_2$, then $\log(p_1)< \log(p_2)$; this order is reversed with $-\log p$).

This reversal seems to concern you, but it's a direct consequence of the negation - it should happen with negative log probabilities. Think of negative log probability as a scale of "rarity" - the larger the number, the rarer the event is (the article refers to it as 'surprise value', or surprisal, which is another way to think about it). If you don't like that reversal, work with $\log p$ instead.
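The reversal can be checked numerically with the figures from the question. This is only a sketch: the helper name `surprisal_bits` is hypothetical, and base-2 logs are assumed because they reproduce the question's values of 2 and 6.64.

```python
import math

def surprisal_bits(p):
    """Negative base-2 log probability: the larger the value, the rarer the event."""
    return -math.log2(p)

# The two pairs of probabilities from the question.
pairs = {"p1, p2": (0.5, 0.5), "p3, p4": (0.1, 0.1)}

for name, (a, b) in pairs.items():
    product = a * b
    total_surprisal = surprisal_bits(a) + surprisal_bits(b)
    print(f"{name}: product = {product:.2f}, summed surprisal = {total_surprisal:.2f}")
```

The pair with the larger product has the smaller summed surprisal, which is exactly the ordering flip that the negation introduces.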

To convert negative-log-probabilities back to probabilities, you must negate before exponentiating. If we say $s_i = -\log(p_i)$ ($s$ for 'surprise value'), then $p_{AB}=e^{-[s_A+ s_B]}\,.$ As you see, that reverses direction a second time, giving us back what we need.
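A small round-trip sketch of that conversion, using natural logs to match the $e^{\ldots}$ in the formula (the function names `to_surprisal` and `from_surprisal` are invented for this example):

```python
import math

def to_surprisal(p):
    """s = -log(p), using the natural log."""
    return -math.log(p)

def from_surprisal(s):
    """Negate first, then exponentiate: p = exp(-s)."""
    return math.exp(-s)

s_a = to_surprisal(0.1)
s_b = to_surprisal(0.1)

# Surprisals of independent events add; negating the sum before
# exponentiating recovers the product of the probabilities.
p_ab = from_surprisal(s_a + s_b)
print(p_ab)  # agrees with 0.1 * 0.1 up to floating-point rounding
```

Forgetting the negation would give $e^{s_A+s_B}$, which is greater than 1 whenever the events are not certain, so the mistake is easy to spot.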

gung - Reinstate Monica
Glen_b
  • +1 "Think of negative log probability as a scale of "rarity" - the larger the number, the rarer the event is" – Zhubarb Oct 24 '14 at 16:17