ks_2samp test in Python scipy - low D statistic, low p-value?

Asked Sep 06 '17 at 07:48

Active Jul 26 '18 at 11:27

Viewed 699 times

As the title says, I'm getting both a low D statistic and a low p-value from the ks_2samp test. More specifically:

Ks_2sampResult(statistic=0.049890046265079313, pvalue=0.0011365796735152277)

These two results seem contradictory to me. If the maximum absolute difference between the two empirical CDFs is only about 0.05, I would say the samples come from mostly the same distribution, so such a low p-value strikes me as unintuitive.
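To see how a D of about 0.05 can go together with a small p-value, here is a minimal sketch of the classical large-sample (Kolmogorov) approximation for the two-sided two-sample test. The helper name `ks_asymptotic_pvalue` and the sample sizes are hypothetical choices for illustration, and scipy's own computation may differ in details (exact methods, small-sample corrections), so the numbers are only indicative.

```python
import math

def ks_asymptotic_pvalue(d, n, m, terms=100):
    """Approximate two-sided p-value for a two-sample KS statistic d.

    Uses the Kolmogorov series Q(lam) = 2 * sum_{k>=1} (-1)^(k-1) * exp(-2 k^2 lam^2)
    with lam = d * sqrt(n * m / (n + m)). This is a textbook approximation,
    not necessarily what scipy computes internally.
    """
    lam = d * math.sqrt(n * m / (n + m))
    total = 0.0
    for k in range(1, terms + 1):
        total += (-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
    # Clamp to [0, 1] to guard against rounding in the alternating sum.
    return min(max(2.0 * total, 0.0), 1.0)

D = 0.05  # same order of magnitude as the statistic in the question
sizes = (100, 500, 1500, 3000, 10000)  # hypothetical equal sample sizes
pvalues = {n: ks_asymptotic_pvalue(D, n, n) for n in sizes}

for n, p in pvalues.items():
    print(f"n = m = {n:>5}: p = {p:.4g}")
```

Under this approximation the p-value depends on D only through D * sqrt(n*m/(n+m)), so the same D that is unremarkable for a hundred observations becomes strong evidence against equality for a few thousand.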

Both samples contain more than 1,500 observations, and both variables take values in [0, 1]. I have also found this post.

It seems that both the p-value and the D statistic decrease as the sample size increases. This makes me concerned about using this method to test whether two distributions are the same. I would like to hear more opinions on this, as I am now fairly convinced that the test should not be trusted in my case. But if it really is misleading here, why should I trust it in any case?
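One way to probe this concern is a small simulation. The sketch below is not the original data: it assumes two normal samples, either identical in distribution or with a hypothetical mean shift of 0.1, and an arbitrary seed. It calls `scipy.stats.ks_2samp` at several sample sizes so that D and the p-value can be compared side by side.

```python
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(0)  # fixed seed, chosen arbitrarily

def compare(n, shift):
    """Draw two normal samples of size n whose means differ by `shift`, then run ks_2samp."""
    a = rng.normal(loc=0.0, scale=1.0, size=n)
    b = rng.normal(loc=shift, scale=1.0, size=n)
    res = ks_2samp(a, b)
    return res.statistic, res.pvalue

sizes = (200, 2000, 20000, 200000)
same = {n: compare(n, shift=0.0) for n in sizes}      # identical distributions
shifted = {n: compare(n, shift=0.1) for n in sizes}   # small but real difference

for n in sizes:
    d0, p0 = same[n]
    d1, p1 = shifted[n]
    print(f"n = {n:>6} | same: D = {d0:.4f}, p = {p0:.3g} | shifted: D = {d1:.4f}, p = {p1:.3g}")
```

In a setup like this, D for the shifted pair should settle near the true maximum gap between the two CDFs (roughly 0.04 for a 0.1 shift in the mean of a unit normal) instead of shrinking to zero, while its p-value keeps falling as n grows. For the identical pair, D shrinks with n but the p-value has no systematic trend. A small D with a small p-value is therefore consistent with a real but small difference detected by a large sample.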

edited Sep 06 '17 at 10:21

Sven Hohenstein

asked Sep 06 '17 at 07:48

Aleksandar Jovanovic

Would it be possible for you to add a code snippet to generate the given results? I am facing a similar issue and I'm trying to get to the bottom of it. – Luca Cappelletti Nov 24 '18 at 11:38
