I am learning the basics of text mining. I am trying to find syntagmatic relations in text, for example the word "technology" tends to occur whenever the word "information" occurs, i.e. the co-occurrence in "Information Technology".
One measure used to quantify this relationship is conditional entropy.
$H(X_1 | X_2)$ is the conditional entropy of the occurrence of word $X_1$ given that word $X_2$ occurred in the document.
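As I understand it, with binary occurrence indicators ($X_i = 1$ if word $w_i$ appears in a document, $0$ otherwise), this expands to the standard definition:

$$H(X_1 | X_2) = \sum_{v \in \{0,1\}} p(X_2 = v)\, H(X_1 | X_2 = v) = -\sum_{u,v \in \{0,1\}} p(X_1 = u, X_2 = v) \log p(X_1 = u | X_2 = v)$$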
$H(X_1 | X_2)$ and $H(X_1 | X_3)$ give the remaining randomness of word $X_1$ when $X_2$ occurs and when $X_3$ occurs, respectively. The words co-occur strongly if little randomness is left, so we select the pairs whose conditional entropy falls below a chosen threshold (see the sketch below).
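To check my understanding, here is a minimal sketch of how I would compute this from document-level binary occurrences and apply the threshold. The toy corpus, the vocabulary, the threshold value, and the helper name are just illustrative assumptions, not from any particular library:

```python
import math
from itertools import combinations

def conditional_entropy(docs, w1, w2):
    """H(X1 | X2): remaining uncertainty about w1 occurring, given w2's occurrence status."""
    n = len(docs)
    h = 0.0
    for v in (True, False):                      # X2 = present / absent
        docs_v = [d for d in docs if (w2 in d) == v]
        p_v = len(docs_v) / n                    # p(X2 = v)
        if p_v == 0:
            continue
        for u in (True, False):                  # X1 = present / absent
            p_u_given_v = sum((w1 in d) == u for d in docs_v) / len(docs_v)
            if p_u_given_v > 0:
                # accumulate -p(u, v) * log p(u | v), since p(u, v) = p(v) * p(u | v)
                h -= p_v * p_u_given_v * math.log2(p_u_given_v)
    return h

# toy corpus: each document is represented as a set of words
docs = [{"information", "technology", "news"},
        {"information", "technology"},
        {"sports", "news"},
        {"information", "retrieval"}]

vocab = ["information", "technology", "news"]
threshold = 0.9   # arbitrary cutoff for this toy example
for w1, w2 in combinations(vocab, 2):
    h = conditional_entropy(docs, w1, w2)
    if h < threshold:
        print(f"H({w1} | {w2}) = {h:.3f}  -> candidate syntagmatic pair")
```

Note that the direction matters in this sketch: $H(X_1 | X_2)$ is generally not equal to $H(X_2 | X_1)$.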
So what do $H(X_1 | X_2)$ and $H(X_3 | X_2)$ capture about the relation of $X_2$ with the words $X_1$ and $X_3$?
Also, why are $H(X_1 | X_2)$ and $H(X_3 | X_2)$ not comparable?