I'm using the .show_most_informative_features() method of NLTK's Naive Bayes classifier to generate features to be used with a lexicon. For my binary-classification problem, the informativeness of each feature is calculated as (where $W$ = feature and $V$ = class): $$ \max\!\left(\frac{P(W \mid V_1)}{P(W \mid V_2)},\; \frac{P(W \mid V_2)}{P(W \mid V_1)}\right) $$
I just want to make sure that I understand how NLTK's Naive Bayes calculates the probability of a feature given a particular class. Essentially, I need someone to walk me through how this quantity is calculated:
$$ P(W \mid V_i) $$
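To make the ratio concrete, here is how I read the first row of the sample output further down; the 13.9 : 1.0 comes straight from that output, and I am treating it as the ratio of the two class-conditional probabilities:

$$ \frac{P(\text{outstanding} = 1 \mid \text{pos})}{P(\text{outstanding} = 1 \mid \text{neg})} \approx \frac{13.9}{1.0} $$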
I believe the type of Naive Bayes NLTK uses is multinomial. An explanation sourced from this Stack Overflow post:
The probability of a word given the tag is computed in the train() function using Expected Likelihood Estimation from ELEProbDist, which is a LidstoneProbDist object under the hood with the gamma argument set to 0.5, and it does:
    class LidstoneProbDist(ProbDistI):

"The Lidstone estimate for the probability distribution of the experiment used to generate a frequency distribution. The "Lidstone estimate" is parameterized by a real number gamma, which typically ranges from 0 to 1. The Lidstone estimate approximates the probability of a sample with count c from an experiment with N outcomes and B bins as (c + gamma) / (N + B*gamma). This is equivalent to adding gamma to the count for each bin, and taking the maximum likelihood estimate of the resulting frequency distribution."
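To make the quoted formula concrete, here is a minimal sketch (a toy example of my own, not NLTK's internal training code) that computes the Lidstone/ELE estimate by hand and checks it against nltk.probability.ELEProbDist:

    from nltk.probability import FreqDist, ELEProbDist

    # Toy data: for some (feature, class) pair, the feature value 1 was
    # observed 3 times and the value 0 was observed 7 times.
    fd = FreqDist({1: 3, 0: 7})

    c = fd[1]       # count of the sample we care about         -> 3
    N = fd.N()      # total number of observed outcomes         -> 10
    B = fd.B()      # number of bins (distinct sample values)   -> 2
    gamma = 0.5     # ELE is the Lidstone estimate with gamma = 0.5

    manual = (c + gamma) / (N + B * gamma)   # (3 + 0.5) / (10 + 2*0.5) ~= 0.318

    ele = ELEProbDist(fd)                    # bins defaults to the observed bins
    print(manual, ele.prob(1))               # both print ~0.318

As far as I can tell from the train() source, NLTK keeps one such frequency distribution per (label, feature name) pair and wraps it in an ELEProbDist, so $P(W \mid V_i)$ is exactly this smoothed estimate.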
How is this explanation represented as a step-by-step mathematical process?
Detailed explanation on Stack Overflow. NLTK Naive Bayes documentation. NLTK probability module documentation.
Sample Output:
    Most Informative Features
     outstanding = 1                 pos : neg    =     13.9 : 1.0
       insulting = 1                 neg : pos    =     13.7 : 1.0
      vulnerable = 1                 pos : neg    =     13.0 : 1.0
       ludicrous = 1                 neg : pos    =     12.6 : 1.0
     uninvolving = 1                 neg : pos    =     12.3 : 1.0
      astounding = 1                 pos : neg    =     11.7 : 1.0
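For context, here is a rough sketch of the kind of setup that produces output in this format. The movie_reviews corpus and the binary word-presence features are assumptions on my part; the NLTK calls that actually matter here are NaiveBayesClassifier.train() and show_most_informative_features():

    import random
    from nltk.corpus import movie_reviews          # assumes nltk.download('movie_reviews')
    from nltk.classify import NaiveBayesClassifier

    # Hypothetical vocabulary; a real run would use e.g. the most frequent corpus words.
    vocabulary = ['outstanding', 'insulting', 'vulnerable',
                  'ludicrous', 'uninvolving', 'astounding']

    def binary_features(words, vocabulary):
        present = set(words)
        # 1 if the vocabulary word occurs in the document, else 0
        return {w: int(w in present) for w in vocabulary}

    documents = [(list(movie_reviews.words(fid)), category)
                 for category in movie_reviews.categories()
                 for fid in movie_reviews.fileids(category)]
    random.shuffle(documents)

    train_set = [(binary_features(words, vocabulary), label) for words, label in documents]
    classifier = NaiveBayesClassifier.train(train_set)

    # Prints one "feature = value   label_a : label_b = ratio : 1.0" line per feature,
    # where the ratio is the max/min of the smoothed P(feature=value | label) values.
    classifier.show_most_informative_features(10)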