I am trying to test whether the means of two populations that contain many zeros are different. Here is a Python example:
from scipy.stats import mannwhitneyu
import numpy as np
a = np.random.random(100)
b = np.random.random(100) * 2
aa = np.hstack((a, np.zeros(1000)))
bb = np.hstack((b, np.zeros(1000)))
np.random.shuffle(aa)
np.random.shuffle(bb)
mw_stat_1, p_value_1 = mannwhitneyu(a, b) # 100 obs in a and b
mw_stat_2, p_value_2 = mannwhitneyu(aa, bb) # 1000 zeros added to each of a and b
# Take the mean of each block of 20 elements; aa and bb each become an array of 55 block means
samp_sum_aa = aa.reshape(-1,20).mean(axis=1)
samp_sum_bb = bb.reshape(-1,20).mean(axis=1)
mw_stat_3, p_value_3 = mannwhitneyu(samp_sum_aa, samp_sum_bb)
Result:
>>> p_value_1 # no zeros
2.5956488654494193e-09
>>> p_value_2
0.42124151395226317
>>> p_value_3 # using sampling
0.0020853586023447269
I find that if I run the Mann-Whitney test on the raw populations (after the zeros are added), the p-value is large. However, if I first average random samples of 20 elements and test those sample means, the p-value is small enough that I can reject the null hypothesis for all practical purposes.
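To see how much this depends on one particular shuffle, the block-averaging step can be repeated over many shuffles. This is only a minimal sketch: the helper name `block_mean_pvalue`, the seed, and the 200 repetitions are arbitrary choices of mine, and I pass `alternative="two-sided"` explicitly, which may not match the default that produced the numbers above.

```python
import numpy as np
from scipy.stats import mannwhitneyu


def block_mean_pvalue(x, y, block_size, rng):
    """Shuffle x and y, average consecutive blocks, and test the block means.

    Hypothetical helper for illustration; len(x) and len(y) must be
    divisible by block_size.
    """
    x_means = rng.permutation(x).reshape(-1, block_size).mean(axis=1)
    y_means = rng.permutation(y).reshape(-1, block_size).mean(axis=1)
    return mannwhitneyu(x_means, y_means, alternative="two-sided").pvalue


rng = np.random.default_rng(0)  # arbitrary seed, only for reproducibility

# Same construction as above: 100 non-zero values plus 1000 zeros per group.
aa = np.hstack((rng.random(100), np.zeros(1000)))
bb = np.hstack((rng.random(100) * 2, np.zeros(1000)))

# Repeat the block-averaging test over 200 independent shuffles.
p_values = np.array([block_mean_pvalue(aa, bb, 20, rng) for _ in range(200)])

print("min p-value:   ", p_values.min())
print("median p-value:", np.median(p_values))
print("max p-value:   ", p_values.max())
```

If the spread between the smallest and largest p-value is wide, the conclusion depends on how the elements happened to be grouped, which is part of what I am unsure about.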
Is sampling a proper technique here? If so, how do I choose the right sample size? Are there other methods to address this problem?