Kolmogorov-Smirnov two-sample test

Question

The Suanshu Statistics Library supports the "Kolmogorov-Smirnov two-sample test" by rejecting the null hypothesis if the $p$-value is smaller than the significance level $\alpha$ (e.g., $p<0.05$).

My question is whether there is any method, in the Suanshu library or any other statistics library, to check whether the test statistic exceeds the critical value (for $\alpha = 0.05$) in order to reject the null hypothesis. If not, is there a formula for computing the critical values of the two-sample Kolmogorov-Smirnov test ourselves, to check against the test statistic?

Any suggestions regarding this are welcome.

score 4 · Accepted Answer · edited Jun 11 '20 at 14:32


I am assuming you are asking because the Suanshu help page says of the K-S distribution, "This is not done yet." Luckily, it is very easy to do in R. If x and y are your two samples, ks.test(x, y) returns the test statistic and the p-value. For example,

> x <- rnorm(50)
> y <- runif(30)
> ks.test(x, y)    
        Two-sample Kolmogorov-Smirnov test    
data:  x and y 
D = 0.5, p-value = 9.065e-05
alternative hypothesis: two-sided

By default, it computes exact or asymptotic p-values depending on the product of the sample sizes (exact p-values for n.x*n.y < 10000 in the two-sample case), or you can choose explicitly with a third argument, exact=F or exact=T. Exact p-values are calculated using the methods of Marsaglia, et al. (2003), which the Suanshu documentation also cites. Some large sample approximations are given here, although I don't have a proper citation. Lastly, if you don't want to install R, there are web calculators for the two-sample K-S test, although I don't know whether they use the same algorithm as R, because the one I found only reported the p-value to three decimal places.
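As for a formula for the critical value, the usual large-sample approximation for the two-sided two-sample test can be written out directly. This is the standard asymptotic result, not necessarily what Suanshu or R compute internally, and it can be inaccurate for small samples. With sample sizes $n$ and $m$, the null hypothesis is rejected at level $\alpha$ when

$$D_{n,m} > c(\alpha)\sqrt{\frac{n+m}{nm}}, \qquad c(\alpha) = \sqrt{-\tfrac{1}{2}\ln\frac{\alpha}{2}}.$$

For $\alpha = 0.05$, $c(\alpha) \approx 1.358$. With the sample sizes used in the R example ($n = 50$, $m = 30$), the critical value is about $1.358 \times \sqrt{80/1500} \approx 0.314$, so $D = 0.5$ rejects the null hypothesis, in agreement with the small p-value.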


answered Aug 01 '11 at 15:06

lockedoff


@lockedoff Your explanation helped me a lot. As far as I know, all statistics libraries use the p-value to reject the null hypothesis if the p-value is smaller than the significance level α (p < 0.05 or 0.01); the other option is to check whether the test statistic (D) exceeds the critical value for α. My confusion is which of the two is the appropriate one, giving the exact result, if I want to reject the null hypothesis. Thanks – Sam Aug 01 '11 at 16:30
@Sam Use either -- they give the same result (the relationship is 1-1 for fixed sample sizes and test size $\alpha$). Typically, you would report the test statistic and p-value, but not the critical value. E.g., "There was a statistically significant difference between the two distributions according to the two-sample Kolmogorov-Smirnov test (D = .6, p < .0001)." If $\alpha = .05$ is not an accepted convention in your field, report once that all of your statistical tests were of size $\alpha = .05$ (e.g., in your Methods section). – lockedoff Aug 01 '11 at 17:09
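A minimal R sketch of the two decision rules discussed in these comments. It assumes the large-sample critical value $c(\alpha)\sqrt{(n+m)/(nm)}$ with $c(\alpha) = \sqrt{-\tfrac{1}{2}\ln(\alpha/2)}$; `ks_critical` is a hypothetical helper written for this illustration, not a function from R or Suanshu. Because the critical value here is asymptotic while `ks.test` may report an exact p-value, the two rules can disagree when D is very close to the threshold.

```r
# Hypothetical helper: asymptotic critical value for the two-sided
# two-sample K-S test with sample sizes n and m at level alpha.
ks_critical <- function(n, m, alpha = 0.05) {
  c_alpha <- sqrt(-0.5 * log(alpha / 2))  # about 1.358 for alpha = 0.05
  c_alpha * sqrt((n + m) / (n * m))
}

set.seed(1)                # fixed seed so the run is reproducible
x <- rnorm(50)
y <- runif(30)

res    <- ks.test(x, y)    # returns D (statistic) and the p-value
d_crit <- ks_critical(length(x), length(y))

# Rule 1: compare the test statistic with the critical value.
reject_by_d <- unname(res$statistic) > d_crit
# Rule 2: compare the p-value with the significance level.
reject_by_p <- res$p.value < 0.05

print(c(D = unname(res$statistic), critical = d_crit, p = res$p.value))
print(c(reject_by_d = reject_by_d, reject_by_p = reject_by_p))
```

Either rule can be used for the decision; reporting D and the p-value, as suggested above, lets readers apply whichever threshold they prefer.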

score 1 · Answer 2 · edited May 21 '20 at 16:53

SuanShu's Kolmogorov-Smirnov package can do anything R can do. The software computes both D and the p-value, just as R does. You can then compare the p-value with whatever significance level you want. In fact, SuanShu gives you the whole K-S distributions for different cases.

In addition, as pointed out in R's help page, the R code does not handle duplicates, so SuanShu is more precise when the data contain duplicates.

See the examples here:

http://redmine.numericalmethod.com/projects/public/repository/entry/Examples/src/main/java/com/numericalmethod/suanshu/examples/HypothesisTesting.java

https://github.com/nmdev2020/SuanShu

https://nm.dev
