p(x) in an anomaly detection system gives values greater than 1

Question

Possible Duplicate:
Probability distribution value exceeding 1 is OK?

I followed the machine learning course by Andrew Ng and decided to implement an anomaly detection system for one of my problems. This is how I have proceeded so far:

I have about $m = 100\text{k}$ feature vectors which represent the normal behavior.
The values are scaled to the interval $[-1; 1]$, and the mean $\mu$ is calculated and subtracted from all the values.
The scaled feature vectors (data points) are then inserted into a matrix $M$ so that each row represents one data point.
Next, the covariance matrix is computed as $\Sigma = \frac{1}{m} M^{T} M$.
To classify a new data point, I apply the same scaling as before and subtract the mean. Let's call the resulting vector $x$.
This vector is plugged into the following formula, where $n$ is the number of features: $p(x) = \frac{1}{(2\pi)^{\frac{n}{2}}|\Sigma|^{\frac{1}{2}}} e^{-0.5 x^{T}\Sigma^{-1}x}$
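The steps above can be sketched roughly as follows. This is only a minimal illustration, not the exact code in question: it assumes NumPy, uses synthetic data in place of the real feature vectors, and skips the scaling step because the synthetic values already lie in $[-1; 1]$. The names `fit_gaussian` and `density` are hypothetical helpers.

```python
import numpy as np

def fit_gaussian(X):
    """Estimate the mean and covariance from X (one data point per row)."""
    mu = X.mean(axis=0)
    M = X - mu                       # centered data matrix
    sigma = (M.T @ M) / X.shape[0]   # (1/m) M^T M
    return mu, sigma

def density(x, mu, sigma):
    """Multivariate normal density at x. This is a density, not a probability."""
    n = mu.shape[0]
    d = x - mu
    quad = d @ np.linalg.solve(sigma, d)   # d^T Sigma^{-1} d without an explicit inverse
    norm = np.sqrt((2 * np.pi) ** n * np.linalg.det(sigma))
    return np.exp(-0.5 * quad) / norm

# Assumption: synthetic "normal behavior", 3 tightly clustered features.
rng = np.random.default_rng(0)
X = rng.normal(loc=0.0, scale=0.05, size=(10_000, 3)).clip(-1, 1)

mu, sigma = fit_gaussian(X)
p_at_mean = density(mu, mu, sigma)      # far above 1 because |Sigma| is tiny
p_far = density(mu + 0.5, mu, sigma)    # vanishingly small, far from the data
print(p_at_mean, p_far)
```

With features squeezed into a small range, $|\Sigma|$ becomes very small, so the normalizing factor in front of the exponential is large and values in the hundreds or thousands near the mean are expected.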

I assumed that $p(x) \in [0;1]$, but I also get values $> 1000$.

Do you have any advice on where I should go from here? Thanks.

Related: http://stats.stackexchange.com/questions/9427/interpreting-gaussian-probabilities-greater-than-1 (also closed as a duplicate). A duplicate Bayesian question is at http://stats.stackexchange.com/questions/13275/bayesian-probability-1-is-it-possible. — whuber, Aug 30 '12 at 13:55

score 4 · Accepted Answer · edited Aug 30 '12 at 12:47

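The $p(x)$ you are using is the normal probability density. It is not a probability and can be greater than $1$: only its integral over a region is a probability, and the integral over all $x$ equals $1$, so a density concentrated in a small region (small $|\Sigma|$) must take large values there. Bayes classification is determined by finding the class with the highest probability density when costs for the error types are the same.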
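To make this concrete, here is a small one-dimensional sketch using only the Python standard library. The value $\sigma = 0.01$ and the helper name `normal_pdf` are chosen purely for illustration. The density at the mean is far above $1$, yet the area under the curve is still $1$.

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Univariate normal density."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

sigma = 0.01                            # illustrative narrow spread
peak = normal_pdf(0.0, sigma=sigma)     # about 39.89, clearly greater than 1

# Midpoint Riemann sum over [-10 sigma, 10 sigma] approximates the total area.
steps = 20_000
width = 20 * sigma / steps
area = sum(
    normal_pdf(-10 * sigma + (i + 0.5) * width, sigma=sigma) * width
    for i in range(steps)
)
print(peak, area)
```

For anomaly detection this means $p(x)$ should be compared against a threshold chosen on the same scale as the density values, not against a number assumed to lie in $[0;1]$.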

answered Aug 30 '12 at 12:08

Michael R. Chernick

