
I am a bit confused. Can someone explain to me how to calculate mutual information between two terms based on a term-document matrix with binary term occurrence as weights?

$$ \begin{array}{c|cccc} & \text{Why} & \text{How} & \text{When} & \text{Where} \\ \hline \text{Document1} & 1 & 1 & 1 & 1 \\ \text{Document2} & 1 & 0 & 1 & 0 \\ \text{Document3} & 1 & 1 & 1 & 0 \end{array} $$

$$I(X;Y)= \sum_{y \in Y} \sum_{x \in X} p(x,y) \log\left(\frac{p(x,y)}{p(x)p(y)} \right)$$

Thank you


1 Answer

How about forming a joint probability table holding the normalized co-occurrences in the documents? From that table you can obtain the joint entropy and the marginal entropies. Finally, $$I(X;Y) = H(X)+H(Y)-H(X,Y).$$
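For concreteness, here is a minimal Python sketch of that approach on the matrix from the question (picking 'How' and 'Where' as the two terms is just an illustrative choice; logarithms are base 2, so the result is in bits):

```python
import numpy as np

# Binary term-document matrix from the question: rows are documents,
# columns are the terms 'Why', 'How', 'When', 'Where'.
terms = ["Why", "How", "When", "Where"]
X = np.array([
    [1, 1, 1, 1],   # Document1
    [1, 0, 1, 0],   # Document2
    [1, 1, 1, 0],   # Document3
])

def entropy(p):
    """Shannon entropy in bits; zero-probability cells are skipped."""
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

def mutual_information(a, b):
    """I(X;Y) = H(X) + H(Y) - H(X,Y) for two binary occurrence vectors."""
    joint = np.zeros((2, 2))
    for x, y in zip(a, b):
        joint[x, y] += 1            # count each (x, y) occurrence pattern
    joint /= len(a)                 # normalize into a joint probability table
    px = joint.sum(axis=1)          # marginal distribution of the first term
    py = joint.sum(axis=0)          # marginal distribution of the second term
    return entropy(px) + entropy(py) - entropy(joint.ravel())

# Example: mutual information between 'How' and 'Where'.
a = X[:, terms.index("How")]
b = X[:, terms.index("Where")]
print(mutual_information(a, b))     # about 0.2516 bits
```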

    When the joint and marginal distributions have been determined, why is it necessary to compute $H(X)$, $H(Y)$ and $H(X,Y)$ and use the formula you exhibit? Can't the mutual information be determined directly via the formula given by the OP since everything needed for "plugging in", viz. $p(x,y), p(x)$ and $p(y)$ are known at this point? – Dilip Sarwate Dec 29 '12 at 19:33
    The formulas are equivalent, but the latter can be more interpretable at first glance. – Zoran Dec 29 '12 at 20:55
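To make the equivalence discussed in these comments concrete, a direct plug-in of the question's formula gives the same number; this sketch reuses `np`, `X`, and `terms` from the snippet above:

```python
# Direct evaluation of I(X;Y) = sum_{x,y} p(x,y) log2(p(x,y) / (p(x) p(y)))
# for the same pair ('How', 'Where'); zero-probability cells contribute nothing.
a = X[:, terms.index("How")]
b = X[:, terms.index("Where")]
joint = np.zeros((2, 2))
for x, y in zip(a, b):
    joint[x, y] += 1
joint /= len(a)
px, py = joint.sum(axis=1), joint.sum(axis=0)
mi = sum(joint[x, y] * np.log2(joint[x, y] / (px[x] * py[y]))
         for x in (0, 1) for y in (0, 1) if joint[x, y] > 0)
print(mi)   # about 0.2516 bits, matching H(X) + H(Y) - H(X,Y)
```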