
I was wondering: can we compare two random variables just as we compare two real numbers? Does that make sense? For instance, if $X$ and $Y$ are two random variables, does $X>Y$ mean something, or is it plainly nonsense? Does anyone have some ideas? Your opinion is greatly appreciated!

shijing SI
  • It depends entirely on the context. The indicator $\mathcal{I}\{X>Y\}$ is a random variable that has a Bernoulli distribution with success probability $p = P(X>Y)$. So, $X>Y$ gives you a single realization from that distribution - a sample of such values will give you information about $p$ - whether or not this is meaningful depends on the context - can you provide more info? – Macro May 26 '12 at 05:20
  • As Macro says, you can calculate $p={\mathbb P}(X < Y)$. – Procrastinator May 26 '12 at 10:15
  • What properties do you want $\gt$ to have? Normally, this symbol is reserved for a *transitive* relation. Now, one can find *many* transitive relations on RVs merely by comparing some numerical property: for instance, we might say that $X\gt Y$ iff the median of $X$ exceeds the median of $Y$, or iff the first quartile of $X$ exceeds the first quartile of $Y$, etc. Although not nonsense, such relations are not very deep or interesting. What more might you be looking for in the $\gt$ relation? – whuber May 26 '12 at 20:35

3 Answers


I agree with the comments by Procrastinator and Macro, but I think your question is clear and has a direct answer without going into the issue of evaluating it in terms of a probability. If $X$ and $Y$ are random variables with values on the real line, or in any other space that can be ordered, then $\{X>Y\}$ has meaning as a measurable event. So yes, it is meaningful. Under those circumstances there exists a joint probability measure on the set of pairs $(x,y)$ of values that can be taken on by $X$ and $Y$. If this probability measure has a density, integrating the joint density over the set of points where $x>y$ gives the probability that $X$ is greater than $Y$. For discrete distributions this is done by summing the probability over all the discrete points where $x>y$.
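As a concrete sketch of the continuous case (the two normal distributions below are only an illustrative assumption, not part of the answer), one can integrate the joint density numerically over the region where $x>y$ and compare against the closed-form value:

```python
import numpy as np
from scipy import integrate, stats

# Illustrative (assumed) setup: X ~ N(1, 1) and Y ~ N(0, 1), independent,
# so the joint density factorises into the product of the marginals.
joint = lambda y, x: stats.norm.pdf(x, loc=1) * stats.norm.pdf(y, loc=0)

# P(X > Y) = integral of the joint density over {(x, y): y < x}.
# Finite limits are wide enough that the truncated normal tails are negligible.
p_numeric, _ = integrate.dblquad(joint, -8, 9, -8, lambda x: x)

# Sanity check: X - Y ~ N(1, 2), so P(X > Y) = Phi(1 / sqrt(2)).
p_exact = stats.norm.cdf(1 / np.sqrt(2))
print(p_numeric, p_exact)  # both ~0.7602
```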

Michael R. Chernick
  • (+1) As long as you can compare $X$ and $Y$ (_i.e._ not if they are multidimensional etc.), it has a meaning. – gui11aume May 26 '12 at 17:26
  • @gui11aume In 1-dimensional Euclidean space there is a natural order to numbers, but there are other spaces on which orderings can be defined. Distance functions define a partial ordering because many points have the same distance. So, for example, under Euclidean distance in $\mathbb{R}^2$ either $X>Y$, $X=Y$, or $X<Y$, and we can compute $P[X>Y]$ based on the distance function. – Michael R. Chernick May 26 '12 at 17:36
  • That comment is incorrect, Michael: $\mathbb{R}^2$ cannot be ordered based on Euclidean distance. But the general statement in your answer *is* correct, because you do not need the relation $\gt$ on $\mathbb{R}^2$ to be an ordering at all: it need only be a measurable subset for it to be usable for comparing two real-valued random variables. – whuber Oct 23 '19 at 15:33

As far as I know, this statement makes sense in two contexts--as the definition of an event or as a constraint on a variable--but is a little imprecise.

Macro, Procrastinator, and Michael Chernick have discussed the "event" part above, but a concrete example might help. Suppose you have two random variables $X$ and $Y$ which are determined by rolling a pair of fair dice. You could be interested in the event $\{X>Y\}$, i.e., how often the number on the first die is strictly larger than the number on the second. In this particular case $P(X > Y)=15/36$, which is easy enough (in this case) to get by enumerating all the possibilities.
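A brute-force enumeration of the 36 equally likely outcomes (just to make the counting explicit) reproduces that value:

```python
from itertools import product

# All 36 equally likely (first die, second die) outcomes; count those
# where the first die is strictly larger than the second.
favourable = sum(1 for x, y in product(range(1, 7), repeat=2) if x > y)
print(favourable, "/ 36 =", favourable / 36)  # 15 / 36 = 0.4166...
```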

It also makes sense when defining random variables. I can't think of a good example off the top of my head, but imagine that $X$ is a person's age and $Y$ is how long they've had some disease. Then, obviously, we know that $Y \le X$.

That said, in general, I'm not sure it makes a ton of sense to compare two arbitrary random variables, at least not without defining exactly what you mean by less than, greater than, and equal up front. In a casual context, I can imagine someone using $X > Y$ to mean $\mathbb{E}(X) > \mathbb{E}(Y)$. It would be better to be more explicit about it, since it could also refer to other things (e.g., the median, an upper bound, etc.). Plus, keep in mind that some of these quantities aren't defined for all random variables: for example, a Cauchy-distributed random variable doesn't have a mean.
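As a sketch of how those two casual readings can disagree (the exponential and uniform distributions here are purely illustrative choices):

```python
from scipy import stats

# Illustrative pair: X ~ Exponential(mean 1), Y ~ Uniform(0, 1.6).
X = stats.expon()                     # mean 1,   median ln 2 ~ 0.693
Y = stats.uniform(loc=0, scale=1.6)   # mean 0.8, median 0.8

print(X.mean() > Y.mean())            # True:  E(X) > E(Y)
print(X.median() > Y.median())        # False: median(X) < median(Y)
```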

Finally (and pedantically), random variables often have units. If $X$ is the balance of my retirement account and $Y$ is the number of viral particles in a sample, it doesn't make a whole lot of sense to talk about $X > Y$, or even $\mathbb{E}(X) > \mathbb{E}(Y)$.

Matt Krause

Ordering random variables is a thing, e.g., the concept of stochastic dominance. Basically, the ordering is based on the $n$-th order moments and/or the cumulative distribution function (CDF) of a random variable. However, the ordering is partial, not total, i.e., you cannot induce a total ordering on the entire space of random variables; in other words, some pairs of variables are not comparable.
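A minimal sketch of a first-order stochastic dominance check (the normal distributions and the evaluation grid are illustrative assumptions): $X$ dominates $Y$ when $F_X(t) \le F_Y(t)$ for every $t$, and two variables whose CDFs cross are simply not comparable under this order.

```python
import numpy as np
from scipy import stats

t = np.linspace(-10, 10, 2001)     # evaluation grid (illustrative)

X = stats.norm(loc=1, scale=1)     # shifted up: its CDF lies below Y's everywhere
Y = stats.norm(loc=0, scale=1)
Z = stats.norm(loc=0, scale=2)     # same mean as Y, larger spread: CDFs cross

print(np.all(X.cdf(t) <= Y.cdf(t)))  # True  -> X first-order dominates Y
print(np.all(Y.cdf(t) <= Z.cdf(t)))  # False -> Y and Z are not comparable
print(np.all(Z.cdf(t) <= Y.cdf(t)))  # False
```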

thanhtang