I'm using a Wilcoxon signed-rank test to compare two related samples of non-normally-distributed data. The data contain some ties, including pairs with identical values (i.e. zero differences), so when I run the test in R, I get this warning:
wilcox.test(data$valueA, data$valueB, paired = TRUE)
Wilcoxon signed rank test with continuity correction
data:  data$valueA and data$valueB
V = 181, p-value = 0.07691
alternative hypothesis: true location shift is not equal to 0
Warning message:
In wilcox.test.default(data$valueA, data$valueB, paired = TRUE) :
  cannot compute exact p-value with zeroes
I understand why the warning is given, since the test assumes a continuous distribution and my data is discrete (analogous to survey data). I also assume that the p-value reported by R is conservative, in the sense that the ties make it more difficult to reject the null hypothesis.
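To make this reproducible without my data, here is a minimal made-up example that triggers the same warning (the numbers are arbitrary; all that matters is that two pairs have equal values, i.e. zero differences):

a <- c(1, 3, 6, 10, 5)
b <- c(1, 2, 4,  7, 5)   # pairs 1 and 5 are equal, so their differences are zero
wilcox.test(a, b, paired = TRUE)
# Warning message:
# In wilcox.test.default(a, b, paired = TRUE) :
#   cannot compute exact p-value with zeroes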
I read on this forum that one way around this problem is to jitter the data and run multiple trials to estimate what the p-value would be without these ties:
ps <- sapply(1:1000, function(i) {
  # jitter one of the vectors slightly to break the zero differences
  wilcox.test(data$valueA, jitter(data$valueB, amount = 0.001),
              paired = TRUE)$p.value
})
mean(ps < 0.05)
# 0.877, so ~88% of trials were significant at 0.05
mean(ps)
# 0.03291361, the average p-value over the 1000 trials
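One caveat I should state: amount = 0.001 has to be much smaller than the spacing between my discrete values, so the jitter only breaks ties and can never flip the sign of a real difference. A quick check along those lines:

# smallest gap between distinct observed values; the jitter amount
# should stay well below half of this so no values get reordered
min(diff(sort(unique(c(data$valueA, data$valueB)))))
# e.g. 1 for integer, survey-style data, so 0.001 is safely small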
This seems to be sufficient evidence that the null hypothesis should be rejected, but I'm not very familiar with this process, so I have three questions:
- Is this procedure acceptable and is my interpretation reasonable? I'm not trying to "p-hack", but I also want to report the difference if it's real.
- This process seems difficult to report in a scientific paper. Is there a commonly accepted way of describing this procedure that won't raise eyebrows?
- How would you report the test? As an averaged p-value? Would you also average the test statistics? And then you have the SD of the p-value as well (see the sketch after this list)... This seems messy.
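For concreteness, here is the kind of summary I imagine producing, collecting both the V statistic and the p-value across jitter trials (the seed is arbitrary, just so the numbers are repeatable):

set.seed(1)  # arbitrary seed so the trial results are repeatable
trials <- t(sapply(1:1000, function(i) {
  tst <- wilcox.test(data$valueA, jitter(data$valueB, amount = 0.001),
                     paired = TRUE)
  c(V = unname(tst$statistic), p = tst$p.value)
}))
colMeans(trials)   # averaged V statistic and averaged p-value
sd(trials[, "p"])  # the SD of the p-value, which is what feels messy to report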
If there's a better way to approach this, I'm open to suggestions as well. Thanks!