I'm struggling with comparing samples that have multiple response variables, where the variables have a strict order of precedence or importance. For the sake of this exercise, let's assume that each sample is a triple of non-negative integers $(A, B, C)$. Given two triples $t_1 = (a_1, b_1, c_1)$ and $t_2 = (a_2, b_2, c_2)$, we say that $t_1 < t_2$ if and only if $a_1 < a_2$, or $a_1 = a_2$ and $b_1 < b_2$, or $a_1 = a_2$ and $b_1 = b_2$ and $c_1 < c_2$ (and, of course, we can extend this to any number of dimensions). In other words, we have a lexicographical ordering. For example, consider the following made-up data:
Sample (Treatment 1) | Response (Treatment 1) | Sample (Treatment 2) | Response (Treatment 2) |
---|---|---|---|
1 | (7, 3, 8) | 1 | (7, 7, 1) |
2 | (3, 2, 7) | 2 | (3, 2, 7) |
3 | (5, 6, 1) | 3 | (5, 8, 1) |
4 | (5, 6, 8) | 4 | (5, 7, 8) |
5 | (5, 7, 3) | 5 | (6, 7, 3) |
Now, I want to compare these two treatments and see whether there is a statistically significant difference between them. All the observations from both groups are independent of each other. However, within a tuple, the values may be dependent (although there is no clear functional relationship). That is, in general, when the most important metric decreases, the subsequent metrics tend to increase.
Usually, my data are not normally distributed and I use the Wilcoxon-Mann-Whitney rank-sum (U) test. But that applies to a single response variable.
Note that the U test looks natural for this case since I'm comparing ranks directly. In other words, the computation does not depend on the values per se, but on the ordering (ranks) they generate. And, since ordering these tuples lexicographically is simple, a reimplementation of the U test on those ranks looks sufficient.
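To make the idea concrete, here is a minimal sketch of what such a reimplementation might look like. It relies on the fact that Python compares tuples lexicographically, assigns midranks to tied tuples, and computes only the U statistics (no p-value). The function names `lexicographic_midranks` and `mann_whitney_u` are made up for illustration, and the sketch says nothing about whether the resulting test is statistically valid.

```python
from itertools import groupby


def lexicographic_midranks(pooled):
    """Return 1-based midranks of the tuples in `pooled`.

    Python compares tuples lexicographically, so sorting by the tuple
    itself yields the ordering described above. Tied tuples share the
    average of the ranks they occupy.
    """
    order = sorted(range(len(pooled)), key=lambda i: pooled[i])
    ranks = [0.0] * len(pooled)
    pos = 0  # number of observations already ranked
    for _, group in groupby(order, key=lambda i: pooled[i]):
        idx = list(group)
        midrank = pos + (len(idx) + 1) / 2
        for i in idx:
            ranks[i] = midrank
        pos += len(idx)
    return ranks


def mann_whitney_u(x, y):
    """Return (U1, U2) computed from lexicographic midranks."""
    n1, n2 = len(x), len(y)
    ranks = lexicographic_midranks(list(x) + list(y))
    r1 = sum(ranks[:n1])            # rank sum of the first group
    u1 = r1 - n1 * (n1 + 1) / 2
    return u1, n1 * n2 - u1


# The made-up data from the table above.
treatment_1 = [(7, 3, 8), (3, 2, 7), (5, 6, 1), (5, 6, 8), (5, 7, 3)]
treatment_2 = [(7, 7, 1), (3, 2, 7), (5, 8, 1), (5, 7, 8), (6, 7, 3)]

u1, u2 = mann_whitney_u(treatment_1, treatment_2)
print(u1, u2)  # 7.5 17.5
```

No numeric encoding of the tuples is needed here, because only their relative order enters the computation.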
However, it is unclear whether a straightforward adaptation of the U test is correct, statistically speaking.
Do you have some clue how I should do this?
I could take a linear combination of each tuple's elements and generate a single number per observation. With suitable multipliers, sorting the generated numbers yields the same lexicographical order as the original tuples. For example, if we know that each individual value is bounded above by $M$, we can use the following multipliers:
- $M_3 = 1$
- $M_2 = (M + 1) \cdot M_3$
- $M_1 = (M + 1) \cdot M_2$
So, we have
$\mathit{single\_value} = M_1 \cdot a + M_2 \cdot b + M_3 \cdot c$
(of course, we don't need $M_3$, but just for the sake of completeness).
The problem with this approach is that, in my case, these multipliers are very large numbers and sometimes lead to numerical instability. So this option does not look like the best one.
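For reference, a minimal sketch of the encoding described above, written as a base-$(M + 1)$ expansion. The function name `encode` and the example bound are assumptions for illustration only. Python integers have arbitrary precision, so this version is exact; the last lines show how order is lost once the encoded value is converted to a 64-bit float.

```python
def encode(t, M):
    """Map a tuple with entries in 0..M to a single integer.

    For a triple (a, b, c) this equals M1*a + M2*b + M3*c with
    M3 = 1, M2 = (M + 1) * M3 and M1 = (M + 1) * M2.
    """
    value = 0
    for x in t:
        if not 0 <= x <= M:
            raise ValueError("entry outside the range 0..M")
        value = value * (M + 1) + x
    return value


samples = [(7, 3, 8), (3, 2, 7), (5, 6, 1), (5, 6, 8), (5, 7, 3)]
M = 8  # upper bound for the made-up data

# Sorting by the encoded value matches the lexicographic order.
assert sorted(samples, key=lambda t: encode(t, M)) == sorted(samples)

# With a large bound, exact integers still distinguish neighbouring tuples,
# but their float conversions collide.
big = 10**30
lo, hi = encode((1, 0, 0), big), encode((1, 0, 1), big)
print(lo < hi)                 # True
print(float(lo) == float(hi))  # True
```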
Thanks,
Carlos