
I am trying to figure out how to calculate the probability that the mean of one normal distribution is greater than the mean of another normal distribution, where I set a normal-gamma prior on each distribution.

More specifically, here is my setup:

$ \begin{eqnarray} x_i | \theta, \tau & \sim & N(\theta, 1 / \tau) \\ \theta | \tau & \sim & N(\mu_0, 1 / (\lambda_0 \tau)) \\ \tau | \alpha_0, \beta_0 & \sim & Gamma(\alpha_0, \beta_0) \end{eqnarray} $

After observing $x_1, \dots, x_n$, I update the hyperparameters of the first distribution to get $\mu_1^x$, $\lambda_1^x$, $\alpha_1^x$, $\beta_1^x$. Likewise, after observing $y_1, \dots, y_m$, I update the hyperparameters of the second distribution to get $\mu_1^y$, $\lambda_1^y$, $\alpha_1^y$, $\beta_1^y$.

Now, I want to calculate the probability $P(\theta^x > \theta^y)$, given the normal-gamma distribution for each parameter.

I realized that I could simulate $\theta_1^x, \dots, \theta_K^x$ and $\theta_1^y, \dots, \theta_K^y$ from each normal-gamma distribution and then compute the proportion of draws for which $\theta_i^x > \theta_i^y$. To do this, I simulate $K$ values of $\tau$ from a $Gamma(\alpha_1, \beta_1)$ distribution and then, for each value of $\tau$, simulate one value of $\theta$ from a $N(\mu_1, 1 / (\lambda_1 \tau))$ distribution.
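For concreteness, the simulation described above might look roughly like the following sketch. It uses only the Python standard library and assumes that $\beta$ is a rate parameter (so the scale passed to the gamma sampler is $1/\beta$). The function names and the posterior values in the usage note are hypothetical and purely illustrative.

```python
import random


def sample_theta(mu, lam, alpha, beta, k, rng):
    """Draw k values of theta from a normal-gamma(mu, lam, alpha, beta) distribution.

    Assumption: beta is a RATE parameter, so the gamma scale is 1 / beta.
    """
    draws = []
    for _ in range(k):
        # tau ~ Gamma(alpha, rate=beta); gammavariate takes (shape, scale)
        tau = rng.gammavariate(alpha, 1.0 / beta)
        # theta | tau ~ N(mu, 1 / (lam * tau)); gauss takes a standard deviation
        draws.append(rng.gauss(mu, (1.0 / (lam * tau)) ** 0.5))
    return draws


def prob_x_greater_mc(post_x, post_y, k=100_000, seed=0):
    """Monte Carlo estimate of P(theta_x > theta_y).

    post_x and post_y are (mu, lam, alpha, beta) tuples of updated hyperparameters.
    """
    rng = random.Random(seed)
    xs = sample_theta(*post_x, k, rng)
    ys = sample_theta(*post_y, k, rng)
    # Proportion of paired draws in which theta_x exceeds theta_y
    return sum(x > y for x, y in zip(xs, ys)) / k
```

For example, `prob_x_greater_mc((1.2, 50, 26, 30), (1.0, 48, 25, 28))` would estimate the probability for two made-up sets of updated hyperparameters; the Monte Carlo standard error shrinks like $1/\sqrt{K}$.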

However, I am looking for an analytic solution to this problem or one that does not rely as heavily on simulation. Can anyone help me out? Thanks.

  • What's your use case? Are you just going to sample from the Bernoulli distribution with that probability? – Neil G Oct 10 '17 at 04:03
  • What priors did you have in mind for $\mu$ and $\lambda$? They don't seem to be explicitly defined. – Maurits M Oct 10 '17 at 09:37
  • My use case is A/B testing data that is normally distributed. $\theta^x$ is the mean of data from the control and $\theta^y$ is the mean of the data from the treatment group. – Michael Frasco Oct 10 '17 at 16:43

1 Answer


I doubt that there is a closed-form analytical solution to your question: marginally, $\theta^x$ and $\theta^y$ follow Student's $t$ distributions, $\mathcal{T}_{\nu^x}(\mu^x,\omega^x)$ and $\mathcal{T}_{\nu^y}(\mu^y,\omega^y)$. Hence $\theta^x-\theta^y$ is distributed as the difference of two independent location-scale Student's $t$ variates, which is not itself a Student's $t$ variate, as discussed in this question.
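Even without a closed form, the probability can be reduced to a one-dimensional integral, since $P(\theta^x > \theta^y) = \int_0^1 P\big(\theta^x > F_y^{-1}(u)\big)\,du$, where $F_y$ is the marginal $t$ cdf of $\theta^y$. The sketch below is one possible way to evaluate this numerically with SciPy rather than by simulation. It assumes the standard normal-gamma result that the marginal of $\theta$ has $\nu = 2\alpha_1$ degrees of freedom, location $\mu_1$ and scale $\sqrt{\beta_1 / (\alpha_1 \lambda_1)}$, with $\beta$ a rate parameter, and that the two posteriors are independent. The function names are hypothetical.

```python
import math

from scipy import integrate, stats


def theta_marginal(mu, lam, alpha, beta):
    """Marginal Student's t distribution of theta under normal-gamma(mu, lam, alpha, beta).

    Assumption: beta is a RATE parameter, giving df = 2 * alpha,
    location mu and scale sqrt(beta / (alpha * lam)).
    """
    return stats.t(df=2.0 * alpha, loc=mu, scale=math.sqrt(beta / (alpha * lam)))


def prob_x_greater(post_x, post_y):
    """Numerically integrate P(theta_x > theta_y) for independent posteriors.

    post_x and post_y are (mu, lam, alpha, beta) tuples.
    Returns (estimate, absolute error estimate reported by quad).
    """
    t_x = theta_marginal(*post_x)
    t_y = theta_marginal(*post_y)
    # Substituting u = F_y(t) maps the integral onto [0, 1], where the
    # integrand P(theta_x > F_y^{-1}(u)) is bounded between 0 and 1.
    return integrate.quad(lambda u: t_x.sf(t_y.ppf(u)), 0.0, 1.0)
```

Integrating over the quantile scale $u \in [0, 1]$ instead of over the real line is a design choice made here so that the quadrature does not have to locate a narrow peak far from the origin.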

Xi'an