2

I'm stuck on the following problem. Say I have two samples of different sizes, each drawn from a continuous distribution:

$(x_1,x_2,...,x_k)$ and $(y_1, y_2,..., y_m)$

What would be the best way to evaluate the probability that they are both drawn from the same distribution?

2 Answers

2

I think a classical approach is the two-sample Kolmogorov-Smirnov (KS) test; see the dedicated Wikipedia page. The two-sample KS test tests the null hypothesis that both samples were drawn from the same distribution. However, you can also consider more powerful and insightful modern methods, like the one described here, which compares the quantiles of both distributions. That method gives more information about how the differences are structured, if there are significant differences.
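As a rough illustration of the KS approach, here is a minimal sketch using `scipy.stats.ks_2samp`. The simulated normal samples, their sizes, the seed and the variable names are my own assumptions for the demo, not part of the question.

```python
import numpy as np
from scipy.stats import ks_2samp

# Simulated data (assumption): three samples of unequal size.
rng = np.random.default_rng(0)
x = rng.normal(loc=0.0, scale=1.0, size=80)
y = rng.normal(loc=0.0, scale=1.0, size=120)  # same distribution as x
z = rng.normal(loc=1.5, scale=1.0, size=120)  # shifted distribution

# Two-sample KS test: the statistic is the largest vertical distance
# between the two empirical CDFs.
same_result = ks_2samp(x, y)
diff_result = ks_2samp(x, z)

print("x vs y: D =", same_result.statistic, "p =", same_result.pvalue)
print("x vs z: D =", diff_result.statistic, "p =", diff_result.pvalue)
```

A small p-value is evidence against the null hypothesis that both samples come from the same distribution; a large one only means the test found no evidence of a difference.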

rapaio
1

You can try the Mann-Whitney-Wilcoxon test and the Wald-Wolfowitz two-sample runs test.

Both are non-parametric tests that can be used with unequal sample sizes. The samples being tested are assumed to be independent of each other.

Each test will give you a p-value: the probability of getting a test statistic at least as extreme as the one observed if the two samples were actually drawn from the same population. Note that this is not the probability that the two samples come from the same distribution.
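Here is a minimal sketch of both tests, under some assumptions. The Mann-Whitney-Wilcoxon test uses `scipy.stats.mannwhitneyu`. I am not aware of a ready-made SciPy function for the two-sample runs test, so `runs_test_two_sample` is a hypothetical hand-written helper that uses the usual normal approximation and assumes no ties (reasonable for continuous data). The simulated samples are again only for illustration.

```python
import numpy as np
from scipy.stats import mannwhitneyu, norm

def runs_test_two_sample(a, b):
    """Wald-Wolfowitz two-sample runs test (hypothetical helper).

    Returns the number of runs in the pooled, sorted sample and a
    lower-tail p-value from the normal approximation: too few runs
    suggests the samples come from different distributions.
    """
    n1, n2 = len(a), len(b)
    n = n1 + n2
    labels = np.concatenate([np.zeros(n1, dtype=int), np.ones(n2, dtype=int)])
    order = np.argsort(np.concatenate([a, b]), kind="stable")
    sorted_labels = labels[order]
    # A new run starts wherever the sample label changes.
    runs = 1 + int(np.count_nonzero(np.diff(sorted_labels)))
    # Mean and variance of the number of runs under the null hypothesis.
    mean = 2.0 * n1 * n2 / n + 1.0
    var = 2.0 * n1 * n2 * (2.0 * n1 * n2 - n) / (n**2 * (n - 1.0))
    z_score = (runs - mean) / np.sqrt(var)
    return runs, float(norm.cdf(z_score))

# Simulated data (assumption): unequal sizes, second sample shifted.
rng = np.random.default_rng(1)
x = rng.normal(loc=0.0, scale=1.0, size=80)
y = rng.normal(loc=2.0, scale=1.0, size=120)

mw_result = mannwhitneyu(x, y, alternative="two-sided")
n_runs, runs_pvalue = runs_test_two_sample(x, y)

print("Mann-Whitney: U =", mw_result.statistic, "p =", mw_result.pvalue)
print("Runs test: runs =", n_runs, "p =", runs_pvalue)
```

Keep in mind that the Mann-Whitney-Wilcoxon test is mainly sensitive to one distribution tending to give larger values than the other, so it can miss differences in spread or shape, while the runs test reacts to any kind of difference but usually with less power.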

Richard Hardy