I have two non-normal populations of scores (A and B) and want to know the probability that a randomly selected score from A is greater than a randomly selected score from B. My plan was to estimate this probability by sampling many corresponding pairs from the two populations. However, I have to repeat this for many pairs of populations, so my question is whether sampling is the best option, or whether there is a more efficient approach, for example some kind of test comparing the two distributions that calculates the desired probability directly. I should also note that all values are integers, so ties are quite possible. Thanks.
-
Do you know the exact distributions of your populations? – WavesWashSands Mar 28 '18 at 10:23
-
Values are integers from 0 to 100 but beyond that won't have any specific distribution. – Terry B Mar 28 '18 at 20:05
2 Answers
If there had been no ties in your scores, the probability would have been equal to $\frac{U}{mn}$, where $U$ is the Wilcoxon-Mann-Whitney U statistic and $n$ and $m$ are the sample sizes of the two groups. Using some implementation of the WMW U test and extracting the test statistic might have been faster than randomly sampling many pairs, depending on what kind of accuracy you need for your estimates, and as a bonus you would have gotten a hypothesis test for the null hypothesis that $P(A > B) = P(A < B)$.
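As a rough sketch of this (my illustration, not part of the original answer), R's built-in `wilcox.test` returns this $U$ as its `statistic`; with no ties, dividing by $mn$ recovers the exact pairwise probability:

```r
# Sketch: with no ties, U / (m*n) equals P(A > B) computed over all pairs.
set.seed(1)
A <- sample(seq(1, 199, by = 2), 30)  # odd values only
B <- sample(seq(2, 200, by = 2), 40)  # even values only, so no A-B ties
U <- wilcox.test(A, B)$statistic      # Mann-Whitney U for A vs B
p_hat <- as.numeric(U) / (length(A) * length(B))
p_direct <- mean(outer(A, B, ">"))    # exhaustive comparison of all pairs
# p_hat and p_direct agree
```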
However, since your scores are integers in the relatively small range $0, \ldots, 100$, we can do even better: count the number of occurrences in each group of each possible value $C(A=i)$ and $C(B=i)$. Then we can efficiently calculate the number of scores in $B$ less than $i$ as
$$C(B < i) = C(B < i-1) + C(B = i-1)$$
and finally calculate the probability
$$P(A > B) = \frac{1}{mn}\sum_{i=0}^{100}C(A=i)C(B<i)$$
This algorithm is $O(n + m)$. The last two steps can be rolled into one loop in languages where this is efficient, but in R for example, it would probably be best to implement it like this:
function(A, B)
{
    cA <- tabulate(A + 1, 101)     # counts of values 0..100 in A
    cB <- tabulate(B + 1, 101)     # counts of values 0..100 in B
    CB <- cumsum(c(0, cB[1:100]))  # CB[i+1] = number of B scores < i
    sum(cA * CB) / (length(A) * length(B))
}
which, when dropping the local variables and exploiting a quirk of the tabulate function (it silently ignores zeros, which contribute nothing to the sum anyway), can be reduced to

function(A, B)
    sum(tabulate(A, 100) * cumsum(tabulate(B + 1, 100))) /
        (length(A) * length(B))
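To check this (my illustration, not part of the original answer; the function name `p_gt` is mine), the counting version can be compared against the $O(nm)$ brute-force pairwise calculation:

```r
# Sketch: the O(n + m) counting approach from the answer, wrapped as a
# function (the name p_gt is mine), checked against brute force.
p_gt <- function(A, B) {
    cA <- tabulate(A + 1, 101)
    cB <- tabulate(B + 1, 101)
    CB <- cumsum(c(0, cB[1:100]))
    sum(cA * CB) / (length(A) * length(B))
}
set.seed(42)
A <- sample(0:100, 500, replace = TRUE)
B <- sample(0:100, 700, replace = TRUE)
p_direct <- mean(outer(A, B, ">"))  # O(n*m) comparison of all pairs
# p_gt(A, B) and p_direct agree
```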

-
Thanks Eivind. I should have noted sooner but my data are integers 0:100, so ties are entirely possible. Does the above hold in cases with ties? – Terry B Mar 28 '18 at 23:12
-
No, unfortunately not. In general $U/mn$ is equal to $P(A> B) + 0.5P(A=B)$, also known as the Vargha-Delaney A statistic, which is a commonly used effect size in some fields, but, of course, not what you were looking for. – Eivind Samuelsen Mar 29 '18 at 06:12
-
Knowing that your observations are in a relatively small finite set opens up new possibilities, though, I will edit my answer. – Eivind Samuelsen Mar 29 '18 at 06:23
-
What you are describing sounds identical to the purpose of the Area Under the Curve for the Receiver Operating Characteristic (AUC-ROC or AUROC), which is closely related to the Wilcoxon rank-sum (Mann-Whitney U) test ($AUROC = U/{mn}$), where $m$ and $n$ are your sample sizes for $A$ and $B$ and $U$ is the U test statistic.
The AUROC is often defined along the lines of the probability that a randomly selected sample from one group is ranked higher than a randomly selected sample from the other.
A useful discussion is provided in an answer here: What does AUC stand for and what is it?
It is rank based and so is non-parametric.
For a useful discussion of the AUROC, see Hanley and McNeil, Radiology, 1982, Vol. 143, pp. 29-36.
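As a brief sketch (my illustration, not part of the original answer), R's `wilcox.test` gives this AUROC directly, and with ties it equals $P(A > B) + 0.5\,P(A = B)$ rather than $P(A > B)$ alone, which is the caveat raised in the comments above:

```r
# Sketch: with ties, U/(m*n) equals P(A > B) + 0.5 * P(A = B), the AUROC.
set.seed(7)
A <- sample(0:100, 50, replace = TRUE)
B <- sample(0:100, 60, replace = TRUE)
U <- wilcox.test(A, B, exact = FALSE)$statistic  # midranks handle ties
auroc <- as.numeric(U) / (length(A) * length(B))
tie_adjusted <- mean(outer(A, B, ">")) + 0.5 * mean(outer(A, B, "=="))
# auroc and tie_adjusted agree
```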
