
Possible Duplicate:
Probability distribution value exceeding 1 is OK?

I followed the machine learning course by Andrew Ng and decided to implement an anomaly detection system for one of my problems. This is how I have proceeded so far:

  1. I have about $m = 100{,}000$ feature vectors that represent normal behavior.
  2. The values are scaled to the interval $[-1; 1]$, and the mean $\mu$ is calculated and subtracted from all the values.
  3. The scaled feature vectors (data points) are then inserted into a matrix $M$ so that each row represents one data point.
  4. Next, the covariance matrix is computed as $\Sigma = \frac{1}{m} M^{T} M$.
  5. To classify a new data point, I apply the same scaling as before and subtract the mean. Let's call the resulting vector $x$.
  6. This vector is plugged into the following formula, where $n$ is the number of features: $p(x) = \frac{1}{(2\pi)^{\frac{n}{2}}|\Sigma|^{\frac{1}{2}}} e^{-\frac{1}{2} x^{T}\Sigma^{-1}x}$
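
The steps above can be sketched roughly as follows. This is only a minimal illustration assuming NumPy; the function names `fit_gaussian` and `density` and the synthetic data are hypothetical and not part of the original setup. The density is computed through its logarithm, which is an implementation choice made here for numerical stability.

```python
import numpy as np

def fit_gaussian(X):
    """Estimate mean and covariance from X (one data point per row)."""
    mu = X.mean(axis=0)
    M = X - mu                        # centered data matrix (step 3)
    sigma = (M.T @ M) / X.shape[0]    # (1/m) * M^T M (step 4)
    return mu, sigma

def density(x_raw, mu, sigma):
    """Multivariate normal density at x_raw (a density, not a probability)."""
    x = x_raw - mu                    # subtract the mean (step 5)
    n = mu.shape[0]                   # number of features
    _, logdet = np.linalg.slogdet(sigma)
    quad = x @ np.linalg.solve(sigma, x)   # x^T Sigma^{-1} x
    log_p = -0.5 * (n * np.log(2.0 * np.pi) + logdet + quad)
    return np.exp(log_p)

# Synthetic stand-in for already-scaled data: tightly clustered, well inside [-1, 1].
rng = np.random.default_rng(0)
X = rng.uniform(-0.05, 0.05, size=(1000, 2))

mu, sigma = fit_gaussian(X)
p_at_mean = density(mu, mu, sigma)    # density evaluated at the sample mean
print(p_at_mean)
```

With data this tightly clustered, the value printed for `p_at_mean` is far above 1, which matches the kind of values described below.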

I assumed that $p(x) \in [0; 1]$, but I also get values $> 1000$.

Do you have any advice on where I should go from here? Thanks.

bjoernz
  • Related: http://stats.stackexchange.com/questions/9427/interpreting-gaussian-probabilities-greater-than-1 (also closed as a duplicate). A duplicate Bayesian question is at http://stats.stackexchange.com/questions/13275/bayesian-probability-1-is-it-possible. – whuber Aug 30 '12 at 13:55

1 Answer


The $p(x)$ you are using is the normal probability density. It is not a probability, so it can be greater than $1$: a density only has to integrate to $1$, and it takes large values wherever the distribution is concentrated in a small region. Bayes classification picks the class with the highest probability density when the costs of the error types are the same.
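
As a small illustration of this point (not taken from the question), consider a one-dimensional normal distribution with a standard deviation of $0.01$. The sketch below uses only the Python standard library; `normal_pdf` is a hypothetical helper name, and the chosen standard deviation and grid spacing are arbitrary assumptions.

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Univariate normal density at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

sigma = 0.01
peak = normal_pdf(0.0, 0.0, sigma)    # 1 / (sigma * sqrt(2*pi)), about 39.9

# Riemann sum over [-0.1, 0.1], i.e. ten standard deviations on each side.
step = 1e-5
area = sum(normal_pdf(-0.1 + i * step, 0.0, sigma) for i in range(20001)) * step

print(peak, area)
```

The peak density is well above $1$, while the area under the curve is still approximately $1$. Only integrals of the density over a region are probabilities.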

Michael R. Chernick