I understand how to use NB and have used it often. However, I am trying to understand how the two different ways I use to calculate the evidence (P(E)) result in the same figure.

The first way I calculate P(E) is the most straightforward:

number of times the evidence appears in the dataset / dataset size

However, I have also been taught and use another method:

P(E) = P(E|H)P(H) + P(E|¬H)P(¬H)

where ¬ means NOT.

Both methods give the same answer; I just struggle to see how, and why the second would ever be used when the first is simpler. I think of the second method as normalising (that makes sense to me), but I don't see how it equates to the evidence.

monster
    Your second equation is an example of the law of total probability: see https://en.wikipedia.org/wiki/Law_of_total_probability and https://en.wikipedia.org/wiki/Law_of_total_expectation – Adrian Apr 27 '16 at 16:09

1 Answer

If there are $N$ observations, of which

  • $n_E$ had the event $E$ true,
  • $n_H$ had the event $H$ true,
  • $n_{E,H}$ had the event $E$ true and the event $H$ true,
  • $n_{¬ H}$ had the event $H$ not true,
  • $n_{E,¬ H}$ had the event $E$ true and the event $H$ not true,

then $N=n_H+n_{¬ H}$ and $n_E =n_{E,H} +n_{E,¬ H}$

You can then say $$P(E|H)P(H) + P(E|¬H)P(¬H) = \dfrac{n_{E,H}}{n_H}\dfrac{n_{H}}{N}+\dfrac{n_{E,¬H}}{n_{¬H}}\dfrac{n_{¬H}}{N} $$ $$=\dfrac{n_{E,H}}{N} +\dfrac{n_{E,¬H}}{N} =\dfrac{n_{E,H}+n_{E,¬H}}{N} =\dfrac{n_{E}}{N} =P(E),$$ which is exactly the observation you made.
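To see this numerically, here is a small sketch (the dataset and probabilities are made up for illustration) that computes $P(E)$ both ways on a simulated set of $(E, H)$ observations and confirms the counts cancel as in the derivation above:

```python
import random

# Hypothetical toy dataset: each record is a pair (E, H) of boolean flags.
random.seed(0)
data = [(random.random() < 0.4, random.random() < 0.6) for _ in range(10000)]

N = len(data)
n_E = sum(e for e, h in data)                 # E true
n_H = sum(h for e, h in data)                 # H true
n_not_H = N - n_H                             # H false
n_EH = sum(e and h for e, h in data)          # E and H true
n_E_notH = sum(e and not h for e, h in data)  # E true, H false

# Method 1: direct relative frequency of E
p_E_direct = n_E / N

# Method 2: law of total probability
# P(E|H)P(H) + P(E|not H)P(not H)
p_E_total = (n_EH / n_H) * (n_H / N) + (n_E_notH / n_not_H) * (n_not_H / N)

print(p_E_direct, p_E_total)  # identical up to floating-point rounding
```

The denominators $n_H$ and $n_{¬H}$ cancel term by term, which is why the two numbers agree to within floating-point rounding.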

Henry
  • Thank you. This answer helped answer my own question here: https://stats.stackexchange.com/questions/395291/summing-posterior-probability-of-naive-bayes. – Yu Chen Mar 03 '19 at 17:20