I have two sequences of equal length, like [1, 1, 2, 2, 2, 3, 3, 3, 4] and [1, 2, 2, 3, 3, 3, 3, 4, 4]. I want to assess how likely it is that they were generated by the same random variable rather than by different ones. What is a good statistical test for this?

The data are not paired, nor are they time series. Each element in a sequence is independent of the others.

(I'm sorry if I'm omitting necessary details; I'm new to stats.)

Bumbo
  • Are you trying to test whether or not two independent samples come from the same distribution? If the sequences are assumed to be independent realizations from a given distribution, then there are standard parametric and nonparametric two-sample tests that could be applicable. If the data are time series or pairwise associated, it changes the picture. You need to provide a lot more information. – Michael R. Chernick Sep 08 '12 at 03:04
  • I noticed an anonymous proposed edit on this post. If this comes from you, Bumbo, please let us know. Yet, you'll have to [register](http://stats.stackexchange.com/faq#login) your account to be able to edit your question yourself. – chl Sep 08 '12 at 10:17
  • It will depend on your specific problem and assumptions, but these links may provide some insight: http://stats.stackexchange.com/questions/1047/is-kolmogorov-smirnov-test-valid-with-discrete-distributions, and http://www.mathworks.com/matlabcentral/newsreader/view_thread/270729 – Nick Sep 08 '12 at 17:43
  • Significantly different with regard to what? Mean? Variance? Something else? If these aren't time-series, what makes them sequences? – Peter Flom Sep 29 '12 at 19:56
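
As a rough illustration of the nonparametric two-sample idea raised in the comments, here is a minimal sketch of a permutation test on a chi-square homogeneity statistic. It assumes the two samples are independent draws of discrete values, as the question states. The function names `chi2_statistic` and `permutation_p_value`, the number of permutations, and the seed are hypothetical choices made for this example, not something proposed in the thread, and this is only one of several tests that could apply.

```python
import random
from collections import Counter


def chi2_statistic(a, b):
    """Chi-square homogeneity statistic for two samples of discrete values."""
    counts_a, counts_b = Counter(a), Counter(b)
    n_a, n_b = len(a), len(b)
    total = n_a + n_b
    stat = 0.0
    for value in set(counts_a) | set(counts_b):
        pooled = counts_a[value] + counts_b[value]
        for observed, n in ((counts_a[value], n_a), (counts_b[value], n_b)):
            # Expected count if both samples share one distribution.
            expected = pooled * n / total
            stat += (observed - expected) ** 2 / expected
    return stat


def permutation_p_value(a, b, n_permutations=10_000, seed=0):
    """Estimate a p-value by reshuffling the pooled data between the two groups."""
    rng = random.Random(seed)  # fixed seed (an assumption) for reproducibility
    observed = chi2_statistic(a, b)
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        shuffled = chi2_statistic(pooled[:len(a)], pooled[len(a):])
        # Small tolerance guards against floating-point ties.
        if shuffled >= observed - 1e-12:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_permutations + 1)


x = [1, 1, 2, 2, 2, 3, 3, 3, 4]
y = [1, 2, 2, 3, 3, 3, 3, 4, 4]
stat = chi2_statistic(x, y)
p_value = permutation_p_value(x, y)
print(f"statistic = {stat:.4f}, permutation p-value = {p_value:.4f}")
```

A small p-value would suggest the samples come from different distributions; a large one only means the data give no evidence of a difference. With nine observations per sample, any such test has little power, so a non-significant result should not be read as proof that the distributions are the same.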

0 Answers