3

Suppose that $W$ has a discrete uniform distribution on $\{1,\cdots,n\}$. Further, suppose that given $W=w$, the random variables $X$ and $Y$ are independently identically distributed geometric random variables with pmfs:

$f_{X|W=w}(x)=w^{-1}(1-w^{-1})^{x-1}, x=1, 2, \cdots$

$f_{Y|W=w}(y)=w^{-1}(1-w^{-1})^{y-1}, y=1, 2, \cdots$

How can I find $P(X \ne Y)$?

I was thinking that I could potentially find the joint distribution of $X,Y,W$, then sum out the $W$ and obtain the joint distribution of $X,Y$. Then, I could somehow leverage this new distribution to obtain $P(X \ne Y)$. In this case, would we get the following?

$f_{X,Y,W}(x,y,w)=f_W(w)f_{X|W=w}(x)f_{Y|W=w}(y)$

This seems difficult to sum out the $W$, so would there be a more efficient way to solve this problem?

Jen Snow
  • 1,595
  • 2
  • 18

1 Answers1

2

Let's begin by solving the general problem of determining the chance that $X\ne Y$ when $X$ and $Y$ are independent random variables.

It's almost always easier to subtract the chance that $X=Y$ from $1.$ The only values that contribute nonzero amounts are those numbers $z$ for which both $\Pr(X=z)$ and $\Pr(Y=z)$ are nonzero and, because the variables are independent, their collective contributions are

$$\Pr(X=Y) = \sum_{z\mid \Pr(X=z)\gt 0 \text{ and } \Pr(Y=z)\gt 0} f_X(z)f_Y(z).$$


For the variables in the question the sum evidently extends over all positive integers (so I'll write "$i$" instead of "$z$" as a reminder of that). Conditional on $W=w,$ we have

$$\Pr(X=Y\mid W=w) = \sum_{i=1}^\infty f_X(i)f_Y(i) = \sum_{i=1}^\infty \frac{(1-1/w)^{2(i-1)}}{w^2} =\frac{1}{2w-1}$$

which is obtained as the sum of a geometric series with common ratio $(1-1/w)^2.$ Consequently,

$$\Pr(X=Y) = \sum_{w=1}^n \Pr(X=Y \mid W=w) \Pr(W=w) = \frac{1}{n} \sum_{w=1}^n \frac{1}{2w-1} = \frac{H(2n)-H(n)/2}{n}$$

where $H(n) = 1 + 1/2 + 1/3 + \cdots + 1/n$ is the nth Harmonic Number. A good approximate formula is $H(n) \approx \log(n) + \gamma + O(n^{-1})$ ($\gamma \approx 0.577$ is Euler's number), whence

$$\eqalign{ \Pr(X\ne Y) &= 1 - \Pr(X=Y) \\&= 1 + (H(n)/2 - H(2n))/n \\ &\approx 1 - \frac{\log n}{2n} - \frac{\log 4 + \gamma}{2n} + O(n^{-2}). }$$

This achieves both an exact answer and an excellent approximation in terms of familiar functions (which helps us all appreciate how the value varies with $n$).

whuber
  • 281,159
  • 54
  • 637
  • 1,101