I have two sequences of equal length, like [1, 1, 2, 2, 2, 3, 3, 3, 4] and [1, 2, 2, 3, 3, 3, 3, 4, 4]. I want to assess how likely it is that they were generated by the same random variable rather than by different ones. What is a good statistical test for this?
The data are not paired, nor are they time series. Each element in a sequence is independent of the others.
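To make the setup concrete, here is a minimal Python sketch of how I think of the data. Since the order of the elements carries no information and the two samples are not matched up element by element, I assume each sequence can be reduced to a table of value counts. The variable names are only for illustration.

```python
from collections import Counter

# The two example samples from the question.
a = [1, 1, 2, 2, 2, 3, 3, 3, 4]
b = [1, 2, 2, 3, 3, 3, 3, 4, 4]

# Order is irrelevant, so each sample reduces to its value counts.
counts_a = Counter(a)
counts_b = Counter(b)

# Side-by-side counts over every value seen in either sample.
values = sorted(set(a) | set(b))
table = [(v, counts_a[v], counts_b[v]) for v in values]

for v, na, nb in table:
    print(f"value {v}: {na} in a, {nb} in b")
```

So the question is really whether these two columns of counts are plausibly samples from one common distribution.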
(I'm sorry if I'm omitting necessary details; I'm new to stats.)