
I have 3 samples of water contamination (bacteria/L) from a lake which I want to compare to another, more "pristine" lake. So I have 6 values in total, 3 belonging to each lake:

contamination,  lake type
7576.48,        pristine
7472.81,        pristine
7511.55,        pristine
6803.65,        test lake
6692.63,        test lake
7128.27,        test lake

What is the best test to compare them? In the boxplot they look different. Initially I thought about Student's t-test, but with n = 3 per group it looks like it will perform rather poorly, right? Should I then go directly for the Wilcoxon rank sum test instead? Or what do you do when you have just 3 cases in each group?

mdewey
terauser
    Also relevant: [1](http://stats.stackexchange.com/questions/44475/is-there-a-statistical-test-to-compare-two-samples-of-size-1-and-3) [2](http://stats.stackexchange.com/questions/21473/minimum-sample-size-for-unpaired-t-test/21474) – Gala Jul 02 '13 at 13:53

2 Answers


This looks similar to the problem posted here: Student's t vs Mann-Whitney U for small equal samples. My answer there applies a fortiori in your case, since you have just n = 3 per group instead of the n = 5 discussed there.

Lucozade

My understanding is that the t-test is in principle appropriate. The main problem is that with so little data you have to assume that your data are normally distributed, and there is no way to check that. One thing to note is that with 3 observations per group, the two-sided p-value from the exact Wilcoxon rank sum test cannot be smaller than .1, no matter what: there are only 20 ways to assign 6 ranks to two groups of 3, and the two most extreme assignments together already account for 2/20 = .1. So it does not seem particularly attractive as an alternative.
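The lower bound on the Wilcoxon p-value can be checked by brute force. The following is only a minimal sketch: `exact_rank_sum_p` is a helper name made up for this illustration, it assumes there are no tied values, and it enumerates every possible assignment of ranks rather than calling a statistics library.

```python
from itertools import combinations

# Data from the question (bacteria/L).
pristine = [7576.48, 7472.81, 7511.55]
test_lake = [6803.65, 6692.63, 7128.27]

def exact_rank_sum_p(x, y):
    """Two-sided exact p-value of the rank sum test (hypothetical helper, assumes no ties)."""
    pooled = sorted(x + y)
    rank = {v: i + 1 for i, v in enumerate(pooled)}  # ranks 1..n
    n, m = len(pooled), len(x)
    observed = sum(rank[v] for v in x)
    expected = m * (n + 1) / 2  # mean of the rank sum under the null
    # Under the null, every choice of m ranks for group x is equally likely.
    sums = [sum(c) for c in combinations(range(1, n + 1), m)]
    extreme = sum(1 for s in sums if abs(s - expected) >= abs(observed - expected))
    return extreme / len(sums)

p = exact_rank_sum_p(pristine, test_lake)
print(p)  # 0.1
```

Here the groups are completely separated, so the observed rank sum is the most extreme one possible, and the p-value still only reaches 2/20.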

Plotting the data and observing that there is absolutely no overlap and a comfortable gap between the two groups seems a reasonable thing to do. I wouldn't necessarily recommend boxplots here: with only three points, the median is going to be the middle value and the quantiles cannot be estimated with any precision. Better to just do a strip chart.
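As a rough illustration of the strip chart idea, here is a text-only sketch that needs no plotting library. The helper `strip_row` and the 60-character width are arbitrary choices made for this example; a real strip chart would normally come from a plotting package.

```python
# Data from the question (bacteria/L).
pristine = [7576.48, 7472.81, 7511.55]
test_lake = [6803.65, 6692.63, 7128.27]

def strip_row(values, lo, hi, width=60):
    """Return one text row with an 'o' at each value's position on [lo, hi]."""
    row = [" "] * (width + 1)
    for v in values:
        row[round((v - lo) / (hi - lo) * width)] = "o"
    return "".join(row)

# Put both groups on a common axis spanning all six values.
lo, hi = min(pristine + test_lake), max(pristine + test_lake)
print("pristine  |" + strip_row(pristine, lo, hi) + "|")
print("test lake |" + strip_row(test_lake, lo, hi) + "|")

# Gap between the lowest pristine value and the highest test-lake value.
gap = min(pristine) - max(test_lake)
print(f"gap = {gap:.2f} bacteria/L")
```

The two rows share one axis, so the lack of overlap is visible at a glance. The gap of roughly 344.5 bacteria/L is larger than the spread within the pristine group.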

Gala
    See also answers and comments at http://stats.stackexchange.com/questions/60725/validity-of-normality-assumption-in-the-case-of-multiple-independent-data-sets-w However, the implication seems to be that you really do have only 6 measurements, not lots of small samples. – Nick Cox Jul 01 '13 at 19:56