
I have a simple question, I think.

I have recently read a paper:

https://www.google.co.uk/url?sa=t&source=web&rct=j&url=http://www.cs.columbia.edu/~kewang/paper/DMSEC-camera.pdf&ved=0ahUKEwiI9um_8LbKAhWDMBoKHSmlDPgQFggaMAA&usg=AFQjCNFYXFLNLcWt350pbZMKhOB9MJu_Yw

It uses a one-class naive Bayes classifier. My question is: can I do the same thing as a one-class multinomial naive Bayes, but using a Gaussian distribution?

The paper uses a threshold to identify the class of interest in a test dataset.

Suppose that:

The standard deviation of each feature in the training data is greater than one.

For each sample, I sum the logs of the Gaussian pdfs over all variables.

Could I then use a threshold, say some number of standard deviations derived from the normal (maybe 3), to identify data points that are close to my one-class training data?
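
To make the idea concrete, here is a minimal sketch of what I have in mind, under assumptions of my own: the data are synthetic, the names `fit_gaussian` and `log_score` are hypothetical, and the cutoff of 3 standard deviations below the mean training score is only one possible choice. The summed log-density is not itself normally distributed, so the 3-sigma rule here is a heuristic; an empirical quantile of the training scores would be an alternative.

```python
import numpy as np

def fit_gaussian(X):
    """Estimate a per-feature mean and standard deviation from one-class data."""
    mu = X.mean(axis=0)
    sigma = X.std(axis=0, ddof=1)
    return mu, sigma

def log_score(X, mu, sigma):
    """Sum the per-feature Gaussian log-densities for each row of X.

    Summing over features is the naive Bayes independence assumption.
    """
    z = (X - mu) / sigma
    per_feature = -0.5 * z**2 - np.log(sigma) - 0.5 * np.log(2.0 * np.pi)
    return per_feature.sum(axis=1)

# Synthetic one-class training data (an assumption for illustration only).
rng = np.random.default_rng(0)
X_train = rng.normal(loc=10.0, scale=2.0, size=(500, 3))

mu, sigma = fit_gaussian(X_train)
train_scores = log_score(X_train, mu, sigma)

# Hypothetical threshold: 3 standard deviations below the mean training score.
threshold = train_scores.mean() - 3.0 * train_scores.std()

# One point near the training data, one far from it.
X_new = np.array([[10.0, 10.0, 10.0],
                  [30.0, -5.0, 10.0]])
accepted = log_score(X_new, mu, sigma) >= threshold
print(accepted)
```

Nothing in this sketch needs the standard deviation to be greater than one; working with log-densities only requires each `sigma` to be positive.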

Tim

1 Answer


According to the paper "One-class document classification via Neural Networks" by Manevitz and Yousef, it seems to be possible to construct a one-class Naive Bayes classifier, even without a standard deviation.

I quote the relevant passage, in which the authors describe how to implement the core of the classifier:

We calculate $p(d|E)$ as the product of $p(w|E)$ for all words in the dictionary that appear in the document $d$. Each of the $p(w|E)$ is estimated independently using the formula:

$p(w|E) = \dfrac{n_w + 1}{n + |dictionary|}$,

where $n_w$ is the number of times word $w$ occurs in $E$, and $n$ is the total number of words in $E$. We calculate a threshold $\delta$ by the minimum over all examples in $E$, of the value $p(d|E)$ for each document in the set of examples. Then we experiment with values $\lambda\cdot\delta$ for $0 < \lambda \leq 1$ as in the previous algorithms using $F_1$ to find the optimal threshold for acceptance. That is, given a new document $d$, we accept it if the calculated value $p(d|E)$ is larger than the determined $\lambda\cdot\delta$. For this classifier algorithm we store $\delta$ and $\lambda$.
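
The quoted procedure can be sketched roughly as follows. This is my own reading rather than the authors' code: the toy documents, the fixed `dictionary`, the helper names `fit` and `log_p`, and the choice $\lambda = 0.5$ are all assumptions for illustration. I also assume each distinct dictionary word in a document contributes one factor, and I work in log space to avoid underflow. The paper tunes $\lambda$ with $F_1$, which needs labelled validation data, so it is simply fixed here.

```python
import math
from collections import Counter

def fit(examples):
    """Count word occurrences n_w and the total word count n in the example set E."""
    counts = Counter(w for doc in examples for w in doc)
    n = sum(counts.values())
    return counts, n

def log_p(doc, counts, n, dictionary):
    """log p(d|E): sum of log p(w|E) over dictionary words appearing in doc.

    p(w|E) = (n_w + 1) / (n + |dictionary|), as in the quoted formula.
    """
    denom = n + len(dictionary)
    words = set(doc) & dictionary  # assumption: each distinct word counted once
    return sum(math.log((counts[w] + 1) / denom) for w in words)

# Toy positive examples E and a fixed dictionary (both hypothetical).
E = [["spam", "offer", "win"],
     ["win", "prize", "offer"],
     ["offer", "free", "prize"]]
dictionary = {"spam", "offer", "win", "prize", "free",
              "meeting", "agenda", "budget"}

counts, n = fit(E)

# delta is the minimum of p(d|E) over the examples; the threshold is lambda * delta.
log_delta = min(log_p(d, counts, n, dictionary) for d in E)
lam = 0.5  # hypothetical value; the paper selects it using F1
log_threshold = math.log(lam) + log_delta

def accept(doc):
    """Accept a new document if p(d|E) > lambda * delta."""
    return log_p(doc, counts, n, dictionary) > log_threshold

print(accept(["offer", "win", "prize"]))       # True
print(accept(["meeting", "agenda", "budget"])) # False
```

With these toy counts, $\delta = 24/4913$, so the threshold is $12/4913$; the first document scores $36/4913$ and is accepted, while the second scores $1/4913$ and is rejected.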

A more detailed description of the algorithm is given in the doctoral dissertation "Characteristic Concept Representations" by Piew Datta.

NeuroMorphing