
Say that we have three i.i.d. random variables $X,Y,Z$. Each has pdf $f(\cdot)$ and cdf $F(\cdot)$, and furthermore, the difference of any two (e.g. $Y-X$) has pdf $f_d(\cdot)$ and cdf $F_d(\cdot)$.

The problem is to calculate the probability that both of these events occur: $Pr(Y-X<c\cap Z-X<d)$ for known $c,d$.

So, we want: $Pr(Y-X<c)\cdot Pr(Z-X<d|Y-X<c)$

At the moment, I'm specifically working on $X\sim N(0,1)$, so $Y-X \sim N(0,2)$, but knowing the general answer would be useful.

  1. Does this calculation depend on knowing whether $c>d$?
  2. What's the answer? :)

Any help would be greatly appreciated!

  • It looks like you're misapplying the product rule for probabilities. The product rule is used for *intersections* of events whereas your question is about a *union*. You probably want to use $P(A \cup B) = P(A) + P(B) - P(A \cap B)$. – tddevlin Jul 13 '17 at 14:54
  • Sorry, typo from `\cup` instead of `\cap`! It's the intersection that I'm after, indeed. Can you help? – David Smerdon Jul 13 '17 at 14:57

2 Answers


Because $X$, $Y$, and $Z$ are mutually independent, $Y$ and $Z$ are independent conditional on $X$, too: by definition, that means the conditional chance of the intersection of events is the product of their conditional chances. Therefore, since adding $X$ to both sides of all inequalities does not change them,

$$\eqalign{ \Pr(Y-X\lt c,\ Z-X\lt d) &= \Pr(Y \lt c+X,\ Z \lt d+X) \\ &=\int_\mathbb{R}\Pr(Y\lt c+X,\ Z\lt d+X\mid X=x)\mathrm{d}F(x)\\ &=\int_\mathbb{R}\Pr(Y\lt c+X\mid X=x)\Pr(Z \lt d+X\mid X=x)\mathrm{d}F(x) \\ &=\int_\mathbb{R} F(c+x)F(d+x)\mathrm{d}F(x). }$$

This does not simplify further in general.

  1. Evidently the order of $c$ and $d$ does not matter. (Since $Y$ and $Z$ are exchangeable, that conclusion doesn't require any calculation.)

  2. Even for Normal distributions this integral is likely to require numeric integration except in special cases. For instance, with $c=d=0$ the integral clearly is $\int_\mathbb{R}\mathrm{d}\left(\frac{1}{3}F(x)^3\right)=1/3$.
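A minimal sketch of the numeric integration mentioned in item 2, assuming $X,Y,Z$ are standard Normal so that $F=\Phi$ and $\mathrm{d}F(x)=\varphi(x)\,\mathrm{d}x$. The function names, the truncation of the real line to $[-10,10]$, and the choice of composite Simpson's rule are illustrative assumptions, not part of the derivation above.

```python
import math


def norm_cdf(x):
    """Standard normal cdf, written in terms of the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def norm_pdf(x):
    """Standard normal pdf."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def joint_prob_normal(c, d, lo=-10.0, hi=10.0, n=4000):
    """Approximate the integral of F(c+x) F(d+x) f(x) dx over [lo, hi].

    Uses composite Simpson's rule with n (even) subintervals. The
    truncation to [lo, hi] is an assumption: the normal density is
    negligible outside [-10, 10].
    """
    h = (hi - lo) / n
    total = 0.0
    for i in range(n + 1):
        x = lo + i * h
        if i == 0 or i == n:
            w = 1  # endpoints
        elif i % 2 == 1:
            w = 4  # odd interior nodes
        else:
            w = 2  # even interior nodes
        total += w * norm_cdf(c + x) * norm_cdf(d + x) * norm_pdf(x)
    return total * h / 3.0


if __name__ == "__main__":
    print(joint_prob_normal(0.0, 0.0))    # should be close to 1/3
    print(joint_prob_normal(1.0, -0.5))
    print(joint_prob_normal(-0.5, 1.0))   # swapping c and d gives the same value
```

Letting one threshold grow very large recovers the marginal probability, for example `joint_prob_normal(1.0, 50.0)` should agree with $\Phi(1/\sqrt 2)$, which is a convenient sanity check on the integrand.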

whuber
  • More simply for the case when $c=d=0$ and the problem is to evaluate $\Pr(Y-X<0, Z-X<0)$ which is the same as asking for the probability that $X$ is the largest of three i.i.d. random variables, the answer is $\frac 13$ by symmetry! – Dilip Sarwate Jul 13 '17 at 23:13
  • @Dilip That's right--it's one reason I selected this particular example, because it's obvious *a priori* that the result has to be $1/3$. The point is that the integral evaluates to $1/3$, too, giving us a little confidence in its correctness. – whuber Jul 13 '17 at 23:24
  • Thanks for the very helpful comments. I have been mulling over your answers and trying to apply them to the specific application where the distribution is Type-1 Extreme Value (or Gumbel). I know that if $X,Y \sim EV_1(\alpha,\beta)$ then $X-Y\sim Logistic(0,\beta)$, so I have been trying to simplify as @Dilip showed for $N(\mu,\sigma^2)$. I feel that it should fit the multinomial logit model but I can't seem to make it work. Any tips? (Should this be a separate question?) – David Smerdon Jul 19 '17 at 11:43
  • The main moral of this solution is to avoid using $X-Y$. In the case of the Gumbel, $F$ takes a particularly simple and tractable form: the integral I have derived is easy to compute. – whuber Jul 19 '17 at 12:12
  • Just a quick follow-up: In the case of $X,Y,Z\sim Unif(0,1)$, we have $F(x)=x$ and so $\int_\mathbb{R} F(c+x)F(d+x)\mathrm{d}F(x)$ simplifies to $\frac{1}{3} + \frac{(c+d)}{2} + cd$ (if I'm not mistaken). But we know $Pr(Y-X<1)=1$, so with $c=1,d=1$, our function should also equal 1. What am I doing wrong? – David Smerdon Jul 29 '17 at 12:20
  • @David Your expression for $F$ is incorrect. It should be equivalent to $F(x)=\max(0,\min(1,x))$. For the integral you should get various formulas depending on how $c$, $d$, and $c-d$ compare to the numbers $-1,0,1$. That will produce 27 distinct formulas. For instance, with $c=d=1/2$, you should obtain $19/24$ for the integral. – whuber Jul 29 '17 at 13:43
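The two special cases raised in these comments can be checked numerically. The sketch below is an illustration under stated assumptions: the helper names are hypothetical, the Gumbel case is restricted to the standard Gumbel (location 0, scale 1), and the closed form $1/(1+e^{-c}+e^{-d})$ is not quoted from the thread but follows from the substitution $u=e^{-x}$ in $\int_\mathbb{R} F(c+x)F(d+x)\,\mathrm{d}F(x)$ with $F(x)=\exp(-e^{-x})$.

```python
import math


def simpson(g, lo, hi, n=2000):
    """Composite Simpson's rule for g on [lo, hi] with n (even) subintervals."""
    h = (hi - lo) / n
    total = g(lo) + g(hi)
    for i in range(1, n):
        total += (4 if i % 2 == 1 else 2) * g(lo + i * h)
    return total * h / 3.0


def unif_cdf(x):
    """cdf of Uniform(0, 1): F(x) = max(0, min(1, x))."""
    return max(0.0, min(1.0, x))


def joint_prob_uniform(c, d):
    """Integral of F(c+x) F(d+x) dF(x); here dF(x) = dx on [0, 1], zero elsewhere."""
    return simpson(lambda x: unif_cdf(c + x) * unif_cdf(d + x), 0.0, 1.0)


def gumbel_cdf(x):
    """cdf of the standard Gumbel distribution (assumed location 0, scale 1)."""
    return math.exp(-math.exp(-x))


def gumbel_pdf(x):
    """pdf of the standard Gumbel distribution."""
    return math.exp(-x - math.exp(-x))


def joint_prob_gumbel(c, d):
    """Same integral for the standard Gumbel, truncated to [-10, 40] (an assumption)."""
    integrand = lambda x: gumbel_cdf(c + x) * gumbel_cdf(d + x) * gumbel_pdf(x)
    return simpson(integrand, -10.0, 40.0, n=20000)


def gumbel_closed_form(c, d):
    """Closed form obtained from the substitution u = exp(-x)."""
    return 1.0 / (1.0 + math.exp(-c) + math.exp(-d))


if __name__ == "__main__":
    print(joint_prob_uniform(0.5, 0.5), 19 / 24)   # the two values should agree
    print(joint_prob_uniform(1.0, 1.0))            # should be 1, as expected
    print(joint_prob_gumbel(0.3, -0.7), gumbel_closed_form(0.3, -0.7))
```

With the clipped cdf the uniform case returns 1 at $c=d=1$, which resolves the apparent contradiction, and the Gumbel closed form has the multinomial logit shape.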

An alternative calculation applies to the special case considered by the OP, where $X,Y,Z$ are independent standard normal random variables (or, more generally, i.i.d. $N(\mu,\sigma^2)$ random variables). Note that $Y-X$ and $Z-X$ are $N(0,2\sigma^2)$ random variables with correlation coefficient $$\rho = \frac{\operatorname{cov}(Y-X,Z-X)}{2\sigma^2} = \frac{\operatorname{var}(X)}{2\sigma^2} = \frac 12,$$ and so for this special case $\Pr(Y-X < c, Z-X < d)$ can be expressed in terms of the bivariate normal distribution function $L(h,k,\rho) = \Pr(U > h, V > k)$, where $U$ and $V$ are standard normal random variables with correlation $\rho$. Standardizing by the common standard deviation $\sqrt 2\sigma$ and using the symmetry of the normal distribution gives $$\Pr(Y-X < c, Z-X < d) =L\left(-\frac{c}{\sqrt 2\sigma},-\frac{d}{\sqrt 2\sigma},\frac 12 \right).$$ The function $L(h,k,\rho)$ is known to have value $L(0,0,\rho) = \frac 14+\frac{\arcsin \rho}{2\pi}$, which equals $\frac 14 + \frac{\pi/6}{2\pi} = \frac 13 ~ \text{when}~ \rho = \frac 12$, as in whuber's answer.
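As a rough check of the correlation and the orthant value, here is a small Monte Carlo sketch, assuming $\mu=0$ and $\sigma=1$. The function names, sample size, and seed are arbitrary illustrative choices, and the simulated figures are estimates, not exact values.

```python
import math
import random


def orthant_prob(rho):
    """L(0, 0, rho) = 1/4 + arcsin(rho) / (2 pi) for a standard bivariate normal."""
    return 0.25 + math.asin(rho) / (2.0 * math.pi)


def simulate(c, d, n=200_000, seed=0):
    """Estimate Pr(Y-X < c, Z-X < d) and corr(Y-X, Z-X) for i.i.d. N(0, 1) draws."""
    rng = random.Random(seed)
    hits = 0
    su = sv = suu = svv = suv = 0.0
    for _ in range(n):
        x, y, z = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        u, v = y - x, z - x
        if u < c and v < d:
            hits += 1
        su += u
        sv += v
        suu += u * u
        svv += v * v
        suv += u * v
    # Sample covariance and variances (population form is adequate for large n)
    mu_u, mu_v = su / n, sv / n
    cov = suv / n - mu_u * mu_v
    var_u = suu / n - mu_u ** 2
    var_v = svv / n - mu_v ** 2
    return hits / n, cov / math.sqrt(var_u * var_v)


if __name__ == "__main__":
    prob, corr = simulate(0.0, 0.0)
    print(prob, orthant_prob(0.5))  # estimate versus the exact value 1/3
    print(corr)                     # should be near 1/2
```

For nonzero $c$ and $d$ the same `simulate` call gives an estimate that can be compared against a bivariate normal cdf routine evaluated at the standardized thresholds $c/\sqrt 2$ and $d/\sqrt 2$.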

Dilip Sarwate