So, I have been trying to test whether two independent samples come from the same distribution, i.e. whether the values in one tend to be greater or less than those in the other. Eventually I found out that the Mann-Whitney U test is the appropriate test for me.
I came across two similar SciPy functions, scipy.stats.ranksums and scipy.stats.mannwhitneyu, which reference the same base theory for comparing independent samples. However, I have searched extensively and could not find out why these functions return different statistics (the p-values agree), and unfortunately I was not able to reverse engineer one from the other.
I would be very pleased if someone could enlighten me with any answer.
Example code:
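import numpy as np
from scipy.stats import ranksums, mannwhitneyu

# Two independent normal samples whose means differ by 1
rng = np.random.default_rng(seed=42)
sample1 = rng.normal(0, 1, 100)
sample2 = rng.normal(1, 1, 100)

# One-sided test: is sample1 stochastically less than sample2?
print(ranksums(sample1, sample2, alternative='less'))
print(mannwhitneyu(sample1, sample2, alternative='less', use_continuity=False))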
-----------------------------------------------------------------------------
RanksumsResult(statistic=-7.122478605972594, pvalue=5.300160604890462e-13)
MannwhitneyuResult(statistic=2085.0, pvalue=5.300160604890462e-13)
code for ranksums: https://github.com/scipy/scipy/blob/v1.7.1/scipy/stats/stats.py#L7713-L7787
code for mannwhitneyu: https://github.com/scipy/scipy/blob/v1.7.1/scipy/stats/_mannwhitneyu.py#L181-L424
EDIT: I am interested in using the statistic returned by the Mann-Whitney test as a measure of AUC-ROC for a machine learning project I have been working on. Only mannwhitneyu() gives me the desired result; ranksums() appears to output a sort of z-score (as the results above suggest). Still, it would be nice to know why that is.
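To check the z-score guess, here is a minimal sketch that tries to recover the ranksums statistic from U. It assumes no ties, no continuity correction, and a SciPy version (1.7 or later) where mannwhitneyu reports U for the first sample. The names z_from_u and auc are mine, not SciPy's.

```python
import numpy as np
from scipy.stats import ranksums, mannwhitneyu

rng = np.random.default_rng(seed=42)
sample1 = rng.normal(0, 1, 100)
sample2 = rng.normal(1, 1, 100)
n1, n2 = len(sample1), len(sample2)

u = mannwhitneyu(sample1, sample2, alternative='less',
                 use_continuity=False).statistic
z = ranksums(sample1, sample2, alternative='less').statistic

# Normal approximation of U (assumption: no ties, no continuity correction):
# under the null hypothesis, mean = n1*n2/2 and variance = n1*n2*(n1+n2+1)/12
z_from_u = (u - n1 * n2 / 2) / np.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)

# U divided by the number of pairs is the fraction of (sample1, sample2)
# pairs in which the sample1 value is larger, i.e. an AUC-like quantity
auc = u / (n1 * n2)

print(z, z_from_u, auc)
```

Plugging in the values printed above, (2085 - 5000) / sqrt(167500) is about -7.1225, which matches the ranksums statistic. So the two functions seem to differ only in whether they report U itself or its standardized version.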