On the denominator of Bayes and Naive Bayes

Question

There are many questions on this topic, but I think the comparison below is harder to find.

Let us assume that all variables in this example are binary for the sake of simplicity. The typical academic Bayes formula given two random variables is the following:

$$ \Pr(A|B) = \frac{\Pr(B|A) \Pr(A)}{\Pr(B|A) \Pr(A) + \Pr(B| \overline{A}) \Pr(\overline{A})} $$

However, searches on this site suggest that the Naive Bayes formula, and in particular its denominator, given three random variables is the following:

$$ \Pr(A|B,C) = \frac{\Pr(B|A) \Pr(C|A) \Pr(A)}{ \Pr(C|B)\Pr(B) + \Pr(B|C)\Pr(C) } $$

Where is the $A$ r.v. in the denominator of the Naive Bayes?

About to type an answer. Meanwhile, can you link to a post that gives the formula you saw? — Arya McCarthy, Apr 21 '21 at 22:10
@AryaMcCarthy There is a chance I misunderstood their answer https://stats.stackexchange.com/questions/404487/naive-bayes-computation-of-denominator — Edv Beq, Apr 21 '21 at 22:17

score 2 · Accepted Answer · answered Apr 21 '21 at 22:29

The question you linked to is by someone curious about an incorrect formula they found. The answer corrects it. The corrected formula does include $A$ in the denominator.

In your notation, the formula from the answer includes $A$ and is:

$$ \Pr(A \mid B, C) = \frac{ \Pr(B \mid A) \Pr(C \mid A) \Pr(A) }{ \Pr(B \mid A) \Pr(C \mid A) \Pr(A) + \Pr(B \mid \overline A) \Pr(C \mid \overline A) \Pr(\overline A) } $$

More in line with their notation, this is equivalent to

$$ \Pr(A \mid B, C) = \frac{ \Pr(B \mid A) \Pr(C \mid A) \Pr(A) }{ \sum_{A' \in \{A, \overline A\}}\Pr(B \mid A') \Pr(C \mid A') \Pr(A') } $$
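As a quick numeric sanity check, here is a minimal sketch that evaluates the formula above for binary $A$. The function name `naive_bayes_posterior` and all of the probabilities are made up for illustration; they do not come from the linked question.

```python
def naive_bayes_posterior(prior_a, p_b_given, p_c_given):
    """Return {True: Pr(A | B, C), False: Pr(not A | B, C)} for binary A.

    Assumes B and C are conditionally independent given A.
    prior_a:   Pr(A)
    p_b_given: {True: Pr(B | A), False: Pr(B | not A)}
    p_c_given: {True: Pr(C | A), False: Pr(C | not A)}
    """
    prior = {True: prior_a, False: 1.0 - prior_a}
    # One numerator term per value of A: Pr(B | A') Pr(C | A') Pr(A')
    joint = {a: p_b_given[a] * p_c_given[a] * prior[a] for a in (True, False)}
    # The denominator is the sum of those terms over both values of A
    evidence = sum(joint.values())
    return {a: joint[a] / evidence for a in (True, False)}


# Hypothetical numbers, chosen only to illustrate the computation
posterior = naive_bayes_posterior(
    prior_a=0.3,
    p_b_given={True: 0.8, False: 0.1},
    p_c_given={True: 0.5, False: 0.4},
)
print(posterior[True])   # approximately 0.811 (= 0.12 / 0.148)
```

Because the denominator sums the same kind of term over $A$ and $\overline A$, the two posteriors add up to 1, which would not hold in general for a denominator that leaves $A$ out.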

Note that they make a conditional independence assumption: $B$ and $C$ are conditionally independent given $A$. That's why we can write $\Pr(A, B, C)$ as $\Pr(B \mid A) \Pr(C \mid A) \Pr(A)$ instead of the general chain-rule factorization $\Pr(B \mid A, C) \Pr(C \mid A) \Pr(A)$.
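To spell out where the denominator comes from, here is a short derivation sketch that uses only the law of total probability and the conditional independence assumption. The denominator is the marginal $\Pr(B, C)$, obtained by summing the joint over both values of $A$:

$$ \Pr(B, C) = \Pr(A, B, C) + \Pr(\overline A, B, C) = \Pr(B \mid A) \Pr(C \mid A) \Pr(A) + \Pr(B \mid \overline A) \Pr(C \mid \overline A) \Pr(\overline A) $$

So $A$ never disappears from the denominator; it is summed out, exactly as in the two-variable Bayes formula.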

Got it - the notation confused me but this makes sense. Thank you — Edv Beq, Apr 21 '21 at 22:32
