Why does the Kruskal-Wallis test show a difference but Wilcoxon doesn't find any between the groups?

Question

Why does the Kruskal-Wallis test show a difference when the Wilcoxon test doesn't find any between the groups, even though the rank medians (which, as far as I know, are what is tested) look very different, and the Dunn test as a post hoc test does find a difference?

Does it have to do with the small sample size (n < 5 per group), so that the chi-squared approximation does not hold?

Here is my test data and code for R (using the dunn.test library):

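library(dunn.test)  # provides dunn.test()

# TR is the treatment group, REP the replicate number, VAL the measured value
test <- data.frame(
  TR  = factor(c("A","B","A","A","C","B","B","C","C","D","D","D")),
  REP = c(3,1,1,2,3,2,3,1,2,1,3,2),
  VAL = c(22,88,24,38,24,72,72,29,15,14,21,17)
)
boxplot(VAL ~ TR, data = test)
shapiro.test(test$VAL)
kruskal.test(VAL ~ TR, data = test)                                   # omnibus test
pairwise.wilcox.test(test$VAL, test$TR, p.adjust.method = "bonferroni")  # pairwise rank-sum tests
dunn.test(test$VAL, test$TR, method = "bonferroni", altp = TRUE)      # Dunn's post hoc test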

Possible duplicate of [Non-significant Kruskal-Wallis and significant post hoc tests](https://stats.stackexchange.com/questions/433141/non-significant-kruskal-wallis-and-significant-post-hoc-tests) — Alexis, Oct 31 '19 at 20:53
No, this is not necessarily a duplicate. The other question was asking if it was appropriate to exclude p-values for post-hoc pairwise comparisons with a nonsignificant omnibus test. The question in this thread is asking more for an explanation of why it might occur that the omnibus test is significant but none of the pairwise comparisons are. This is different. The OP should look at the literature on appropriate pairwise comparisons after a KW test; the Wilcoxon rank-sum test is not necessarily the right approach. — LSC, Oct 31 '19 at 20:59
Also possibly relevant [Mann-Whitney (2 groups) contradicted by Kruskal-Wallis (3 groups)](https://stats.stackexchange.com/questions/401457/) (I mean aside from the issue that the Mann-Whitney test is inappropriate as a *post hoc* test for Kruskal-Wallis.) — Alexis, Oct 31 '19 at 21:16
Aside: different implementations of *post hoc* tests use different methods with different degrees of fidelity with respect to the seminal papers in which the tests were advanced: [What is the difference between various Kruskal-Wallis post-hoc tests?](https://stats.stackexchange.com/questions/141856/) — Alexis, Oct 31 '19 at 21:26
The issue of individual pairwise comparisons not exactly corresponding to the overall test happens even with, say, one-way ANOVA. The non-rejection regions of the two kinds of test don't exactly correspond; they deal with slightly different questions. It's quite possible to get rejection with one but not the other, or vice versa. — Glen_b, Oct 31 '19 at 23:57

score 1 · Accepted Answer · edited Oct 31 '19 at 22:31

The Mann-Whitney-Wilcoxon rank-sum test is inappropriate for post hoc pairwise tests following the rejection of the Kruskal-Wallis null hypothesis because:

(1) the rank-sum test uses different rankings of the data than were used in the Kruskal-Wallis test, and

(2) the rank-sum test does not use the pooled variance estimate implied by the Kruskal-Wallis null hypothesis.

By contrast, Dunn's test (and the more powerful Conover-Iman test) uses the same rankings and the pooled variance estimate. We should therefore not be surprised when the results of the rank-sum tests and of the actual post hoc tests differ.
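To make point (1) concrete, here is a minimal sketch using the data from the question (the variable names are mine, not part of any package). It compares the ranks that Kruskal-Wallis and Dunn's test work with (all 12 observations ranked together) against the ranks a rank-sum test of A vs. D would use (only those 6 observations, re-ranked). As a side note on the small groups, it also checks the exact rank-sum p-value for that pair: with 3 observations per group and no ties, the two-sided exact p-value cannot fall below 2 / choose(6, 3).

```r
# Data from the question
val <- c(22, 88, 24, 38, 24, 72, 72, 29, 15, 14, 21, 17)
tr  <- factor(c("A", "B", "A", "A", "C", "B", "B", "C", "C", "D", "D", "D"))

# Kruskal-Wallis and Dunn's test rank all 12 observations together
pooled_ranks <- rank(val)
mean_pooled  <- sapply(split(pooled_ranks, tr), mean)

# A rank-sum test of A vs. D re-ranks only those 6 observations
ad         <- tr %in% c("A", "D")
pair_ranks <- rank(val[ad])
mean_pair  <- sapply(split(pair_ranks, droplevels(tr[ad])), mean)

print(mean_pooled)  # mean ranks on the shared 1..12 scale
print(mean_pair)    # mean ranks on the pair's own 1..6 scale

# A vs. D has no ties, so wilcox.test() computes an exact p-value.
# Every A value exceeds every D value, the most extreme split possible,
# yet the two-sided p-value cannot go below 2 / choose(6, 3).
p_ad    <- wilcox.test(val[tr == "A"], val[tr == "D"])$p.value
p_floor <- 2 / choose(6, 3)
p_bonf  <- p.adjust(p_ad, method = "bonferroni", n = 6)  # 6 pairwise comparisons

print(c(exact = p_ad, floor = p_floor, bonferroni = p_bonf))
```

On the pooled scale, B has a mean rank of 11 and D about 2.67, a gap that Dunn's test evaluates directly. The A vs. D rank-sum test sees only ranks 1 to 6, reaches its floor of 0.1, and becomes 0.6 after a Bonferroni correction for six comparisons. Pairs that contain tied values (for example, anything involving B, which has two values of 72) are handled differently: R falls back to a normal approximation and issues a warning, so the floor argument applies as stated only to tie-free pairs.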

edited Oct 31 '19 at 22:31

Nick Cox

answered Oct 31 '19 at 21:21

Alexis

Thanks a lot for your answer! Is there any reason not to use conover.test, given that dunn.test is the most referred-to post hoc test for Kruskal-Wallis? – Markie Oct 31 '19 at 21:42
@Markie Not in my book. The Conover-Iman test is just less well known. It gets its power from constructing a t-test rather than a z-test (like Dunn's). – Alexis Oct 31 '19 at 23:27
Awesome, thanks again! – Markie Nov 01 '19 at 00:10
