Probability of data point being from distribution in normal mixtures

Question

Using JMP, I was able to fit a distribution to a set of data, using the normal-2 mixtures model. It returns location (or mean), dispersion (standard deviation) and probability for each of the two normal distributions used to create the normal-2 mixtures. Now, I want to be able to take any data point from that population, and figure out the odds that data point came from each of the two distributions. Is there a way to do this?

I can write JMP scripting code or SQL or something else to calculate the values if I have the formulas involved. — cwyers, May 16 '14 at 19:32

score 4 · Accepted Answer · edited May 17 '14 at 09:53

Let $\mu_1$ and $\mu_2$ be the estimated means and $\sigma_1$ and $\sigma_2$ the estimated standard deviations of population 1 and population 2. Also, let $p$ be the estimated proportion of observations from population 1.

Then, for each observation $x_i$, the estimated probability of belonging to population 1 is

$P(\text{pop. 1} \mid x_i) = \dfrac{p \cdot N( x_i ; \mu_1, \sigma_1)}{p \cdot N( x_i ; \mu_1, \sigma_1) + (1-p) \cdot N( x_i ; \mu_2, \sigma_2)}$

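where $N( x_i ; \mu_1, \sigma_1)$ is the normal density with mean $\mu_1$ and standard deviation $\sigma_1$, evaluated at $x_i$. If using JMP, you could evaluate the normal density with

Normal.Density( (x_i - mu_1) / sigma_1 ) / sigma_1

since the normal density function in JMP only accepts arguments for the standard normal distribution. The final division by sigma_1 converts the standard normal density at the standardized value into the density of $x_i$ itself; without it, the two components are weighted incorrectly whenever $\sigma_1 \neq \sigma_2$.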

The estimated probability of belonging to population 2 would then of course be 1 minus the estimated probability of belonging to population 1.
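For anyone computing this outside JMP, the formula can be sketched in a few lines of Python. This is only an illustration of the calculation above, using the standard library alone; the function names `normal_pdf` and `posterior_pop1` and the parameter values in the example call are made up for the sketch and are not JMP output.

```python
import math

def normal_pdf(x, mu, sigma):
    """Density of a normal with mean mu and sd sigma, evaluated at x."""
    z = (x - mu) / sigma
    # Standard normal density at z, rescaled by 1/sigma.
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def posterior_pop1(x, mu1, sigma1, mu2, sigma2, p):
    """Estimated probability that x came from population 1.

    p is the estimated proportion of observations from population 1.
    """
    w1 = p * normal_pdf(x, mu1, sigma1)          # weighted density, population 1
    w2 = (1.0 - p) * normal_pdf(x, mu2, sigma2)  # weighted density, population 2
    return w1 / (w1 + w2)

# Hypothetical estimates, for illustration only.
prob1 = posterior_pop1(x=2.5, mu1=0.0, sigma1=1.0, mu2=4.0, sigma2=1.5, p=0.6)
prob2 = 1.0 - prob1
print(prob1, prob2)
```

For a point extremely far from both means, both weighted densities can underflow to zero and the division fails, so a production version would probably work with log densities instead.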
