8

What does it mean when people say that a t-test performed on ranked data is equivalent to a Mann-Whitney U-test? Does that mean they just test the same hypothesis/are useful in the same situations, or are they supposed to give the exact same p-values? The reason I ask is that I tried both in R and compared two groups with very small sample sizes (3 and 4). I got completely different answers: one significant and one not.

The two groups are A=(1,2,3) and B=(4,5,6,7).

t-test: p = 0.01

Mann-Whitney U-test: p = 0.06
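For reference, both numbers are easy to reproduce. Since the original R calls aren't shown, here is a sketch in Python/SciPy instead (assuming a pooled-variance two-sample t-test; R's default Welch test gives essentially the same p-value for these data):

```python
from scipy import stats

a = [1, 2, 3]
b = [4, 5, 6, 7]

# Ordinary two-sample t-test (pooled variance)
p_t = stats.ttest_ind(a, b).pvalue  # ~0.012, "significant"

# Mann-Whitney U-test; with no ties and tiny n, SciPy uses the exact null
p_mw = stats.mannwhitneyu(a, b, alternative='two-sided').pvalue  # ~0.057

print(round(p_t, 2), round(p_mw, 2))
```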

Silverfish
Jimj
  • It's impossible to get a significant MW result with Ns of 3 and 4. – Jeremy Miles Apr 05 '13 at 23:02
  • @JeremyMiles At least if you work at the 5% level, since the smallest attainable level is 5.7% -- but if your n's are perforce 3 and 4 respectively (rather than something you can change), one might reasonably criticize insisting on $\leq$5% in such circumstances; indeed, one might well mount an argument for a substantially higher $\alpha$, such as, oh, something more like, say about 11.4%. There's little point in keeping $\alpha$ very low if $\beta$ is really high. – Glen_b Jan 02 '15 at 01:29
  • @Glen_b good point, I should have qualified that. (And nice solution to use a higher alpha). – Jeremy Miles Jan 02 '15 at 16:53
  • @JeremyMiles the tradeoff between the two error rates depends heavily on context of course. – Glen_b Jan 02 '15 at 22:37

2 Answers

7

Does that mean they just test the same hypothesis/are useful in the same situations, or are they supposed to give the exact same p-values?

It means:

(i) the test statistics will be monotonic transformations of each other.

(ii) they give the same p-values, if you work out the p-values correctly.

Your problem is that the t-statistic computed on ranks doesn't follow the distribution you assume when you look it up in t-tables (though in large samples the two will be close). You need to work out its true null distribution to calculate the p-value correctly. This matters most in small samples ... but those are also the samples where the actual distribution is easiest to calculate.
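To make this concrete, the true null distribution for these data can be enumerated directly: under the null, every assignment of 3 of the 7 ranks to group A is equally likely. A sketch in Python/SciPy (the same enumeration is just as easy in R) shows the exact p-value of the t-statistic on ranks coinciding with the exact Mann-Whitney p-value:

```python
from itertools import combinations

from scipy import stats

a, b = [1, 2, 3], [4, 5, 6, 7]           # the data are already ranks 1..7
t_obs = stats.ttest_ind(a, b).statistic  # observed t on the ranks

# Under H0, every way of assigning 3 of the 7 ranks to group A is
# equally likely, so enumerate all C(7,3) = 35 assignments.
count = total = 0
for grp_a in combinations(a + b, 3):
    grp_b = [r for r in a + b if r not in grp_a]  # ranks are distinct
    t = stats.ttest_ind(list(grp_a), grp_b).statistic
    total += 1
    if abs(t) >= abs(t_obs) - 1e-9:
        count += 1

p_exact = count / total  # 2/35 ~ 0.057: only {1,2,3} and {5,6,7} are as extreme
p_mw = stats.mannwhitneyu(a, b, alternative='two-sided').pvalue
print(p_exact, p_mw)
```

Because the t-statistic on ranks is a monotone function of the rank sum, the permutation p-value of the former necessarily equals the exact Mann-Whitney p-value; the t-tables only give 0.01 because they assume a distribution the statistic doesn't have.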

Glen_b
  • So correct me if I am wrong, but what you are saying is the M-W test is giving the correct p-value while the t-test on ranks is giving a value which is slightly inaccurate. OK that helps. Thanks. – Jimj Apr 06 '13 at 00:53
  • Yes, that's it exactly. Indeed, if you can see that the two statistics are monotonic versions of each other, it's easy to see that this cannot be otherwise. Showing it is not difficult algebraically. – Glen_b Apr 06 '13 at 01:37
  • Well, it's possible that the M-W may be giving an approximation as well, depending on the circumstances, but in small samples most packages give the exact p-values. In larger samples they may switch to normal approximation (which tends to be pretty accurate unless there's a lot of ties). Where both are approximations, the M-W will generally be very close to the true p-value. – Glen_b Apr 06 '13 at 01:37
  • And the reason a t-test on ranks is not giving the exact right p-value is because ranked data is not normal? – Jimj Apr 07 '13 at 18:24
  • Correct. Not only is ranked data not normal, which would be sufficient to make the result not have a t-distribution, but given the sample sizes, you also know the variance under the null. – Glen_b Apr 07 '13 at 22:51
  • Is calling it a "*t*-test" a slight misnomer, in the sense that the null distribution *isn't* actually *t*? Obviously the test statistic has a similar form to the conventional *t*-statistic... except presumably we'd make use of the known $\sigma_{ranks}$ under $H_0$, in which case it looks more like a $z$-statistic? (Though admittedly one without the $Z \sim \mathcal{N}(0,1)$ distribution.) – Silverfish Jan 02 '15 at 00:44
  • @Silverfish Well, yes (mostly) and no (a bit). If the intent (in the question) of '*a t-test performed on ranked data*' is that you use the t-tables (which while not correct don't give a bad approximation if your sample sizes aren't too small), then it makes sense to call it a t-test, even though the distribution you're looking up isn't quite right. If the intent is to use only the t-statistic with its correct distribution under the null, or at least a normal approximation in larger samples (and, frankly, if you have a computer, why wouldn't you?), then yes, I agree completely. – Glen_b Jan 02 '15 at 00:51
  • @Glen_b If we are naming it after the form of the test statistic, rather than the null distribution, but $\sigma$ is known from the outset, why *isn't* it a "*z*-test on the ranks"? The only basis I can see for calling it "*t*" is that would be the equivalent parametric test we'd use, but that would be a terrible reason, which makes me think I'm missing something. It seems to me the confusion underlying the original question is very much wrapped up in the naming issue. – Silverfish Jan 02 '15 at 01:02
  • @Silverfish I don't see where we're disagreeing. – Glen_b Jan 02 '15 at 01:12
0

N.B. Not really an answer per se...

Jonas Kristoffer Lindeløv wrote a recent blog post here that addresses the relationships between linear models and group tests (such as the t-test, Mann-Whitney, Wilcoxon, Kruskal-Wallis, ANOVA, etc.). He also created a somewhat limited simulation to assess the differences between the t-test and MW directly.

It would be nice to either have 1) an analytic calculation or 2) a strong set of simulations to create some good rules-of-thumb for when we can use a linear model on ranks as opposed to a nonparametric test, as these are sometimes more convenient.
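As a small step in that direction, here is a rough simulation sketch in Python/SciPy (the sample size n = 20 per group, normal data, and 1000 replicates are arbitrary choices, not a recommendation) comparing the p-value of a t-test on the pooled ranks, looked up in t-tables, against the Mann-Whitney p-value:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n, reps = 20, 1000
diffs = []
for _ in range(reps):
    x = rng.normal(size=n)
    y = rng.normal(size=n)
    # Rank the pooled sample, then t-test the two groups of ranks
    ranks = stats.rankdata(np.concatenate([x, y]))
    p_t = stats.ttest_ind(ranks[:n], ranks[n:]).pvalue
    # Mann-Whitney on the raw data (normal approximation at this n)
    p_mw = stats.mannwhitneyu(x, y, alternative='two-sided').pvalue
    diffs.append(abs(p_t - p_mw))

print(f"max |p_t - p_MW| over {reps} runs: {max(diffs):.3f}")
```

At this sample size the two p-values typically agree to within a few percentage points; real rules of thumb would need this repeated across sample sizes, tie structures, and distributions.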

abalter