I am a bit confused. Can someone explain how to calculate the mutual information between two terms from a term-document matrix that uses binary term occurrence (1 if the term appears in the document, 0 otherwise) as weights?
$$ \begin{matrix} & \text{Why} & \text{How} & \text{When} & \text{Where} \\ \text{Document1} & 1 & 1 & 1 & 1 \\ \text{Document2} & 1 & 0 & 1 & 0 \\ \text{Document3} & 1 & 1 & 1 & 0 \end{matrix} $$
$$I(X;Y)= \sum_{y \in Y} \sum_{x \in X} p(x,y) \log\left(\frac{p(x,y)}{p(x)p(y)} \right)$$
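To make the question concrete, here is a minimal sketch of one common reading of this formula for a pair of terms. It assumes each term is a binary random variable (present or absent in a randomly chosen document), that all probabilities are estimated as relative frequencies over the three documents, that the logarithm is base 2, and that terms with $p(x,y)=0$ contribute nothing to the sum. The names `matrix`, `terms`, and `mutual_information` are made up for illustration.

```python
import math

# Term-document matrix from the question: rows are documents, columns are terms.
terms = ["Why", "How", "When", "Where"]
matrix = [
    [1, 1, 1, 1],  # Document1
    [1, 0, 1, 0],  # Document2
    [1, 1, 1, 0],  # Document3
]


def mutual_information(matrix, i, j, base=2):
    """Estimate I(X;Y) for the terms in columns i and j.

    Assumption: X and Y are binary (term present/absent) and probabilities
    are plain relative frequencies over the documents.
    """
    n = len(matrix)

    # Count how often each (x, y) combination occurs across documents.
    joint = {(x, y): 0 for x in (0, 1) for y in (0, 1)}
    for row in matrix:
        joint[(row[i], row[j])] += 1

    # Marginal counts for each value of X and of Y.
    count_x = {x: joint[(x, 0)] + joint[(x, 1)] for x in (0, 1)}
    count_y = {y: joint[(0, y)] + joint[(1, y)] for y in (0, 1)}

    mi = 0.0
    for (x, y), count in joint.items():
        if count == 0:
            continue  # convention: 0 * log(0) is treated as 0
        p_xy = count / n
        p_x = count_x[x] / n
        p_y = count_y[y] / n
        mi += p_xy * math.log(p_xy / (p_x * p_y), base)
    return mi


how, where, why = terms.index("How"), terms.index("Where"), terms.index("Why")
print(round(mutual_information(matrix, how, where), 4))  # about 0.2516 bits
print(round(mutual_information(matrix, why, how), 4))    # 0.0
```

Under these assumptions, 'How' and 'Where' share roughly 0.25 bits, while 'Why' appears in every document and therefore carries no information about any other term. With only three documents the frequency estimates are very coarse, so the numbers are only illustrative.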
Thank you