
I have 3 treatments with different sample sizes. They are different enzymes for the industrial-scale production of butanol.

The first one has 60 replicates, the second has 30 replicates, and the third has only 6 replicates. I tried running ANOVA followed by Tukey's test, but I think it is not fair, since the standard deviation is larger for the treatment with the small sample size.

I was wondering if there is a way in R to resample/bootstrap/reduce the first two treatments to a sample size of 6 and then run ANOVA followed by Tukey's test.

Anyone with advice or other suggestions?
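To illustrate, what I have in mind is something like the sketch below: draw 6 observations without replacement from each of the larger groups and analyze the balanced subset. (The data and names here are made up for illustration; y1–y3 stand in for my three enzyme treatments.)

```r
# Illustrative sketch of the subsampling idea (simulated data, not lab results)
set.seed(1)
y1 <- rnorm(60, 100, 10)   # treatment 1, n = 60
y2 <- rnorm(30, 100, 12)   # treatment 2, n = 30
y3 <- rnorm(6, 120, 10)    # treatment 3, n = 6

# Subsample 6 observations without replacement from the two larger groups
sub <- data.frame(
  y = c(sample(y1, 6), sample(y2, 6), y3),
  g = factor(rep(c("E1", "E2", "E3"), each = 6))
)

fit <- aov(y ~ g, data = sub)   # balanced one-way ANOVA on the subset
summary(fit)
TukeyHSD(fit)                   # Tukey's test on the balanced subset
```

Of course, a single subsample discards most of the data, so any conclusion would depend on which 6 observations happened to be drawn; that is part of what I am unsure about.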

  • Why is that unfair? This seems consistent with the small sample size, and is a feature, rather than a drawback, of this procedure. – dimitriy Jun 26 '19 at 00:20
  • Because with a big deviation there isn't a statistical difference between this treatment and the others. We didn't expect it, since this is the treatment with the best results at lab scale, but we have never conducted an experiment with a bigger sample size before, and it was not possible to increase the last group beyond 6 replicates. Also, we've heard that a small sample size interferes with ANOVA because of normality, but I do not know if that's right (I'm a biologist trying to apply statistics when possible). – user2501348 Jun 26 '19 at 00:30
  • Can you explain more about the "we've heard" thing? It's pretty vague. People hear lots of weird things with no clear provenance, from unnamed sources, and I'm somewhat mystified by what the actual claim could have been. What exactly was it that was said? – Glen_b Jun 26 '19 at 01:02
  • During college I got the idea that for statistics, the bigger the sample size, the better. Also, that ANOVA requires groups with equal variance and normal distributions. But in real life that is not practical. So, looking for solutions in forums, I found answers such as "the problem does not lie in the difference between the sizes of the datasets but rather in the fact that one data set is very small" or "if you have very unequal sample sizes, you may wish to use bootstrapping instead since it doesn't make any assumptions at all about the distribution of the sample statistic". – user2501348 Jun 26 '19 at 01:29

2 Answers


In a traditional one-factor ANOVA with three treatment groups, one assumes that the three populations have equal variance. If sample sizes are hugely different, then the common variance is mainly estimated from the larger sample(s).

If you doubt that all three treatment groups have the same variance, then that assumption seems inappropriate. However, there is a Welch version of the one-factor ANOVA that does not assume equal variances. (The main idea is somewhat similar to doing a Welch two-sample t test instead of a pooled two-sample t test.) See R documentation for 'oneway.test'.

Here is an example:

set.seed(626)  # for reproducibility of simulated data
x1 = rnorm(60, 100, 10);  x2 = rnorm(30, 100, 12);  x3 = rnorm(6, 120, 10)
x = c(x1, x2, x3)
boxplot(list(x1, x2, x3), varwidth=T, col="skyblue2")

[Boxplot of the three simulated groups]

Box widths above differ to reflect the different sample sizes among groups (the varwidth=T argument).

Here is output from the Welch ANOVA in R:

g = c(rep(1,60), rep(2,30), rep(3,6))
oneway.test(x ~ g)

        One-way analysis of means (not assuming equal variances)

data:  x and g
F = 28.035, num df = 2.000, denom df = 15.694,
p-value = 6.6e-06

Because of the very small P-value, it is clear that there are significant differences among the group population means. [A traditional one-factor ANOVA would have had 93 denominator df; the lower denominator df seen here is characteristic of the Welch ANOVA, especially when population variances differ.]

There are various ways to do post hoc pairwise tests among the three groups. In this case Group 3 is significantly different from the other two. (By default, t.test in R performs Welch tests, which do not assume equal variances.) With Bonferroni protection against false discovery of differences, we should declare a difference between a pair of treatments only if its P-value is below $0.05/3 \approx 0.017.$

t.test(x1, x2)$p.val;  t.test(x1, x3)$p.val;  t.test(x2, x3)$p.val
[1] 0.1394103      # no sig dif btw Gps 1 & 2
[1] 8.800413e-05   # Gps 1 & 3 differ
[1] 0.0001181582   # Gps 2 & 3 differ
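The same comparisons can be run in one call with base R's pairwise.t.test; pool.sd = FALSE gives Welch t tests, and p.adjust.method applies the Bonferroni correction directly rather than leaving the threshold adjustment to the reader. (The data below are re-simulated with the same seed as above.)

```r
# Pairwise Welch t tests with Bonferroni-adjusted P-values
set.seed(626)
x1 <- rnorm(60, 100, 10); x2 <- rnorm(30, 100, 12); x3 <- rnorm(6, 120, 10)
x <- c(x1, x2, x3)
g <- factor(c(rep(1, 60), rep(2, 30), rep(3, 6)))

# pool.sd = FALSE: separate variances per group (Welch);
# adjusted P-values can be compared against 0.05 directly
pairwise.t.test(x, g, pool.sd = FALSE, p.adjust.method = "bonferroni")
```

The adjusted P-values lead to the same conclusions as the three t tests above: Group 3 differs from Groups 1 and 2, while Groups 1 and 2 do not differ significantly.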

This page discusses oneway.test in context.

BruceET

You've done what I think most people would suggest doing. That you have a small sample size is not really something you need to combat; you can either adequately power your experiments or you can deal with the consequences (unless you are prepared to take a Bayesian approach). Anything else is, in my opinion, fishing for a rejection of the null on shaky ground.
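For a sense of what "adequately power" might mean for a balanced follow-up experiment, base R's power.anova.test can give a rough per-group sample size. The effect sizes below are purely illustrative assumptions (group means 100, 100, 120 and within-group SD 10, echoing the simulation in the other answer), not lab results.

```r
# Rough power calculation for a balanced 3-group one-way ANOVA.
# between.var is the variance of the (assumed) group means;
# within.var is the (assumed) common within-group variance.
power.anova.test(groups = 3,
                 between.var = var(c(100, 100, 120)),
                 within.var  = 10^2,
                 power       = 0.8)
```

The reported n is the required number of replicates per group at 80% power; smaller assumed differences between the means would push it up quickly.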

Demetri Pananos
  • The issue isn't just that the sample sizes differ. One is wise to question the wisdom of 'what most people would do' when one of the smaller samples has a noticeably larger variance than the others. – BruceET Jun 26 '19 at 16:06