Suppose I take 2 sets ($A$ and $B$) of 1000 random items out the same distribution; I also suppose that all items are different. I then create a new set $C$ which is the union of $A$ and $B$, since all items are different it has 2000 elements. If $C$ is sorted (ascending or descending), should I roughly end up with a sequence: $a_0,b_1,a_2,b_3,a_4,b_5,...a_i,b_j,...b_{2000}$ ($a_i$ is an element of $A$ and $b_j$ is an element of $B$?
Or alternatively should the number of runs (defined by Wald-Wolfowitz) be almost equal to number of elements in $C$? Here a run is a follow up of elements from the same set. For example:
- $a,b,a,b$ has 4 runs
- $a,a,a,b,b$ has 2 runs
- $a,b,b,b,a,a$ has 3 runs
If this would not be case, what's the reason behind it?
To give a more practical setting, I also implemented this now and asked it on stackoverflow.