Why does Multinomial Naive Bayes work well on discrete features?

Question

I understand that Multinomial Naive Bayes is a specific instance of Naive Bayes in which the data distribution is assumed to be multinomial.

In the sklearn documentation for Multinomial Naive Bayes, it is stated:

Naive Bayes classifier for multinomial models

The multinomial Naive Bayes classifier is suitable for classification with discrete features
(e.g., word counts for text classification). The multinomial distribution normally requires
integer feature counts. However, in practice, fractional counts such as tf-idf may also work.

I observed this to be true when working on a text classification project: the Multinomial Naive Bayes classifier gave the best results of all the classifiers I tried. Can someone explain why it is suited to discrete features?

The multinomial distribution is a distribution over counts of events. – Arya McCarthy, Apr 18 '21 at 16:16
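To make the "distribution over counts" point concrete, here is a minimal from-scratch sketch of a multinomial naive Bayes classifier on word counts. It is not sklearn's implementation. The toy corpus, the function names `train_multinomial_nb` and `predict`, and the choice of additive smoothing with `alpha=1.0` are all assumptions made for illustration. The multinomial coefficient is omitted because it depends only on the document and is the same for every class.

```python
import math
from collections import Counter, defaultdict


def train_multinomial_nb(docs, labels, alpha=1.0):
    """Estimate log priors and smoothed per-class log word probabilities."""
    vocab = sorted({w for doc in docs for w in doc})
    class_sizes = Counter(labels)

    # Pool the word counts of all documents belonging to each class.
    word_counts = defaultdict(Counter)
    for doc, label in zip(docs, labels):
        word_counts[label].update(doc)

    log_prior = {c: math.log(n / len(docs)) for c, n in class_sizes.items()}
    log_theta = {}
    for c in class_sizes:
        # Additive (Laplace) smoothing so unseen words keep nonzero probability.
        total = sum(word_counts[c].values()) + alpha * len(vocab)
        log_theta[c] = {
            w: math.log((word_counts[c][w] + alpha) / total) for w in vocab
        }
    return log_prior, log_theta


def predict(doc, log_prior, log_theta):
    """Pick the class maximizing log prior + sum of count * log word probability."""
    counts = Counter(doc)
    scores = {
        c: log_prior[c]
        + sum(n * log_theta[c][w] for w, n in counts.items() if w in log_theta[c])
        for c in log_prior
    }
    return max(scores, key=scores.get)


# Hypothetical toy corpus: each document is a list of tokens.
docs = [
    ["cheap", "pills", "cheap"],
    ["meeting", "agenda"],
    ["cheap", "offer"],
    ["agenda", "notes", "meeting"],
]
labels = ["spam", "ham", "spam", "ham"]

log_prior, log_theta = train_multinomial_nb(docs, labels)
prediction = predict(["cheap", "cheap", "meeting"], log_prior, log_theta)
```

Each word's count multiplies its log probability, so the model is built around how many times each discrete event occurred. That is the sense in which it suits count features.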

score 0 · Answer 1 · answered Apr 18 '21 at 21:31

The multinomial naive Bayes algorithm is a generalization of the naive Bayes algorithm for the case where your predicted variable is not binary but has more categories. Why did it give you the best results compared to the other algorithms? You were probably lucky. There’s no single best algorithm.

For more details on why naive Bayes works, see "Why do naive Bayesian classifiers perform so well?"

Tim


Thank you for your answer. I agree that there is no single best algorithm, but my question is why multinomial naive Bayes works well on *discrete features*, not why naive Bayes works well in general. – user42 Apr 19 '21 at 07:24
@user42 this statement is incorrect. Multinomial naive Bayes and Bernoulli naive Bayes differ only in the format of the predicted data; the algorithm is exactly the same in both cases. In both cases you cannot use continuous features: you need to either transform them or use a different algorithm than the vanilla one (see https://stats.stackexchange.com/questions/218492/how-does-naive-bayes-work-with-continuous-variables). The quote above talks only about approximating counts with non-discrete values, that is, pretending you have discrete data. – Tim Apr 19 '21 at 11:48
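The remark about approximating counts with non-discrete values can be sketched as follows. Once the per-class word probabilities are fixed, the class score is a weighted sum of log probabilities, so it can be evaluated for fractional weights just as for integer counts. The probabilities, the tf-idf-like weights, and the names `class_scores` and `classify` below are made up for illustration and do not come from any fitted model.

```python
import math

# Hypothetical per-class word probabilities, standing in for a fitted model.
log_theta = {
    "spam": {"cheap": math.log(0.6), "meeting": math.log(0.1), "offer": math.log(0.3)},
    "ham": {"cheap": math.log(0.1), "meeting": math.log(0.7), "offer": math.log(0.2)},
}
log_prior = {"spam": math.log(0.5), "ham": math.log(0.5)}


def class_scores(features):
    """Return log prior + sum_w x_w * log theta_cw for each class c.

    The feature values x_w are treated as (possibly fractional) counts.
    """
    return {
        c: log_prior[c] + sum(x * log_theta[c][w] for w, x in features.items())
        for c in log_prior
    }


def classify(features):
    """Return the class with the highest score."""
    scores = class_scores(features)
    return max(scores, key=scores.get)


integer_counts = {"cheap": 2, "meeting": 1}
fractional_weights = {"cheap": 0.8, "meeting": 0.3}  # made-up tf-idf-like values

count_label = classify(integer_counts)
weight_label = classify(fractional_weights)
```

With fractional inputs the score no longer corresponds to a proper multinomial likelihood, since a multinomial is defined over integer counts. The arithmetic still goes through, which is consistent with the documentation's note that tf-idf "may also work" in practice.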
