Using JMP, I was able to fit a distribution to a set of data, using the normal-2 mixtures model. It returns location (or mean), dispersion (standard deviation) and probability for each of the two normal distributions used to create the normal-2 mixtures. Now, I want to be able to take any data point from that population, and figure out the odds that data point came from each of the two distributions. Is there a way to do this?
I can write JMP scripting code or SQL or something else to calculate the values if I have the formulas involved. – cwyers May 16 '14 at 19:32
1 Answer
Let $\mu_1$ and $\mu_2$ be the estimated means of population 1 and population 2, and $\sigma_1$ and $\sigma_2$ the estimated sd's. Also, let $p$ be the estimated proportion of observations from population 1.
Then, for each observation $x_i$, the estimated probability of belonging to population 1 is
$\dfrac{p \cdot N( x_i ; \mu_1, \sigma_1)}{p \cdot N( x_i ; \mu_1, \sigma_1) + (1-p) \cdot N( x_i ; \mu_2, \sigma_2)}$
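where $N( x_i ; \mu_1, \sigma_1)$ is the normal density with mean $\mu_1$ and sd $\sigma_1$, evaluated at $x_i$. If using JMP, you could evaluate the normal density with
Normal Density( (x_i - mu_1) / sigma_1 ) / sigma_1
if the normal density function is called with only the standardized argument, in which case it returns the standard normal density. The division by sigma_1 rescales that to the density of a normal with sd $\sigma_1$; without it, the probabilities are wrong whenever $\sigma_1 \neq \sigma_2$. The density for population 2 is obtained the same way with mu_2 and sigma_2.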
The estimated probability of belonging to population 2 would then of course be 1 minus the estimated probability of belonging to population 1.
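Since the formula only needs the five fitted parameters, it can be computed outside JMP as well. Below is a minimal Python sketch of the calculation above, not anything JMP produces: the function names `normal_pdf` and `prob_population_1` are hypothetical, and the parameter values are made up for illustration, so substitute the estimates from your own fit.

```python
import math

def normal_pdf(x, mu, sigma):
    """Density of a normal with mean mu and standard deviation sigma at x."""
    z = (x - mu) / sigma
    # Standard normal density at z, rescaled by 1/sigma.
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def prob_population_1(x, mu1, sigma1, mu2, sigma2, p):
    """Estimated probability that x came from population 1.

    p is the estimated proportion of observations from population 1.
    """
    w1 = p * normal_pdf(x, mu1, sigma1)
    w2 = (1.0 - p) * normal_pdf(x, mu2, sigma2)
    return w1 / (w1 + w2)

# Made-up estimates, for illustration only.
mu1, sigma1 = 0.0, 1.0
mu2, sigma2 = 4.0, 1.0
p = 0.5

for x in (0.0, 2.0, 4.0):
    p1 = prob_population_1(x, mu1, sigma1, mu2, sigma2, p)
    # Probability of population 2 is the complement.
    print(x, p1, 1.0 - p1)
```

With these symmetric made-up values, the midpoint x = 2 is equally likely to have come from either population. For a point extremely far from both means, both densities can underflow to zero and the ratio becomes undefined, so a production version would probably want to work with log densities instead.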