Possible Duplicate:
Probability distribution value exceeding 1 is OK?
I followed the machine learning course by Andrew Ng and decided to implement an anomaly detection system for one of my problems. This is how I have proceeded so far:
- I have about $m = 100{,}000$ feature vectors that represent normal behavior.
- The values are scaled to the interval $[-1; 1]$, and the mean $\mu$ is computed and subtracted from all the values.
- The scaled, mean-centered feature vectors (data points) are then stacked into a matrix $M$ so that each row represents one data point.
- Next, the covariance matrix is computed as $\Sigma = \frac{1}{m} M^{T} M$.
- To classify a new data point, I apply the same scaling as before and subtract the mean. Let's call the resulting vector $x$.
- This vector is plugged into the following formula, where $n$ is the dimension of $x$: $p(x) = \frac{1}{(2\pi)^{\frac{n}{2}}|\Sigma|^{\frac{1}{2}}} e^{-\frac{1}{2} x^{T}\Sigma^{-1}x}$
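
The steps above can be sketched roughly as follows. This is a minimal illustration and not my actual code: the function names `fit_gaussian` and `density` are made up for this post, the data is synthetic and assumed to be already scaled into $[-1; 1]$, and the use of `slogdet` and `solve` instead of an explicit determinant and inverse is just one way to evaluate the same formula.

```python
import numpy as np

def fit_gaussian(X):
    """Estimate the mean and covariance from X (one data point per row)."""
    mu = X.mean(axis=0)
    M = X - mu                       # mean-centered data matrix
    sigma = (M.T @ M) / X.shape[0]   # (1/m) * M^T M
    return mu, sigma

def density(x_raw, mu, sigma):
    """Evaluate the multivariate Gaussian density p(x) at a single point."""
    x = x_raw - mu                   # subtract the training mean
    n = mu.shape[0]
    # log|Sigma| and Sigma^{-1} x, computed without forming the inverse
    _, logdet = np.linalg.slogdet(sigma)
    quad = x @ np.linalg.solve(sigma, x)
    log_p = -0.5 * (n * np.log(2.0 * np.pi) + logdet + quad)
    return np.exp(log_p)

# Synthetic stand-in for my data: tightly clustered points (assumption).
rng = np.random.default_rng(0)
X = rng.normal(scale=0.01, size=(1000, 3))

mu, sigma = fit_gaussian(X)
p_at_mean = density(mu, mu, sigma)   # density evaluated at the mean itself
print(p_at_mean)
```

With tightly clustered data like this, the value printed for the point at the mean is well above 1, which is the same kind of result I see with my real data.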
I assumed that $p(x) \in [0;1]$, but I also get values $> 1000$.
Do you have any advice on where I should go from here? Thanks.