
I want to compare the means of X between BMI groups, but have unequal sample sizes between my groups.

group A (n=20, mean=16.2);
group B (n=90, mean=12.8);
group C (n=30, mean=10.8);
group D (n=8,  mean=11.2)
  1. Can I use a one-way ANOVA when the sample sizes are extremely unequal?
  2. I performed a one-way ANOVA. The results showed significant differences for A-B and A-C, but not for A-D. However, when I ran a paired t-test for A-D, the result was significant. Which part am I doing wrong, the one-way ANOVA or the t-test? Is an ANOVA valid in this case?
gung - Reinstate Monica
natsya

1 Answer


You have a total N = 148, distributed across 4 groups. If you had 37 in each group instead, you would have greater statistical power. Apart from that loss of power, a one-way ANOVA is just as valid here as anywhere else (provided the usual assumptions are met). (To understand this better, it may help to read my answer here: How should one interpret the comparison of means from different sample sizes?) So to answer 1. explicitly: yes, you can use a one-way ANOVA when the sample sizes are extremely unequal.
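As a rough illustration, here is a minimal Python sketch of a one-way ANOVA on groups of these sizes. The group sizes and means come from the question; the common standard deviation (`ASSUMED_SD = 5.0`), the normal distribution, and the random seed are assumptions made only so there is data to analyze, so the printed numbers are not the asker's results. The F statistic is also computed by hand to show that nothing in the calculation requires equal group sizes.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Group sizes and means are taken from the question.
sizes = {"A": 20, "B": 90, "C": 30, "D": 8}
means = {"A": 16.2, "B": 12.8, "C": 10.8, "D": 11.2}
ASSUMED_SD = 5.0  # ASSUMPTION: the question does not report any SD

# Simulated (hypothetical) data, one array per group.
samples = {g: rng.normal(means[g], ASSUMED_SD, sizes[g]) for g in sizes}

# One-way ANOVA; f_oneway accepts groups of different lengths.
f_stat, p_value = stats.f_oneway(*samples.values())

# The same F statistic by hand: each group is weighted by its own n.
all_x = np.concatenate(list(samples.values()))
grand_mean = all_x.mean()
ss_between = sum(len(x) * (x.mean() - grand_mean) ** 2 for x in samples.values())
ss_within = sum(((x - x.mean()) ** 2).sum() for x in samples.values())
df_between = len(samples) - 1          # 4 groups - 1
df_within = len(all_x) - len(samples)  # N - number of groups
f_manual = (ss_between / df_between) / (ss_within / df_within)

print(f"F({df_between}, {df_within}) = {f_stat:.3f}, p = {p_value:.4g}")
print(f"F computed by hand = {f_manual:.3f}")
```

The unequal sizes enter only through the per-group weights in the sums of squares; they affect power, not whether the test can be computed.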

However, your description in 2. seems odd to me, so let me add a few notes:

  • If the groups (A through D) were formed by categorizing BMI (a continuous variable), you would be better off using regression with BMI as your predictor; categorizing continuous variables is not a good thing to do.
  • It isn't clear what you mean when you say that A-B and A-C were significant, but A-D wasn't. An ANOVA doesn't tell you that. An ANOVA only tells you if there is a difference somewhere amongst your groups. Did you run some post-hoc test to get those results?
  • I don't see how you could have run a paired t-test to compare A and D when they do not have the same ns. Did you mean an unpaired t-test? Assuming you used a proper post-hoc test after the ANOVA, that was probably the more appropriate option: a separate t-test does not take into account that you are making multiple comparisons, for example.
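To make the contrast between a post-hoc test and a stand-alone t-test concrete, here is a hedged sketch that runs both on the same simulated data. It assumes SciPy 1.8 or later for `scipy.stats.tukey_hsd`; as above, the standard deviation, the normal distribution, and the seed are assumptions, and the data are simulated rather than the asker's. The two p-values for A vs. D will generally differ, because the Tukey procedure adjusts for all six pairwise comparisons and pools the variance across all four groups, while the unpaired t-test looks at A and D in isolation.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Group sizes and means are taken from the question.
sizes = {"A": 20, "B": 90, "C": 30, "D": 8}
means = {"A": 16.2, "B": 12.8, "C": 10.8, "D": 11.2}
ASSUMED_SD = 5.0  # ASSUMPTION: the question does not report any SD

labels = list(sizes)
samples = [rng.normal(means[g], ASSUMED_SD, sizes[g]) for g in labels]

# Tukey's HSD across all groups (handles unequal group sizes).
tukey = stats.tukey_hsd(*samples)

a, d = labels.index("A"), labels.index("D")
p_tukey_ad = tukey.pvalue[a, d]

# Unpaired (independent-samples) Student t-test on A vs. D only,
# with no adjustment for the other comparisons.
t_res = stats.ttest_ind(samples[a], samples[d])
p_ttest_ad = t_res.pvalue

print(f"A vs D, Tukey HSD adjusted p = {p_tukey_ad:.4f}")
print(f"A vs D, unpaired t-test p    = {p_ttest_ad:.4f}")
```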
gung - Reinstate Monica
  • I forgot to mention that I did post-hoc test (Tukey) after that. – natsya Mar 29 '15 at 03:33
  • I wondered if that was the case. The Tukey test is going to be a better approach than the t-tests. – gung - Reinstate Monica Mar 29 '15 at 03:48
  • Thanks for the clear explanation and helpful link. By the way, my mistake: I wrote paired t-test, but it was actually an independent t-test, sorry. I ran the t-test because I thought the ANOVA result for groups A and D was a bit strange: since the mean for D is close to the values for B and C, I expected it to be significant too. Since the t-test result for A and D is significant but the ANOVA result wasn't, I wonder whether it is possible to get different results from ANOVA and a t-test because of the sample size? Thanks again. – natsya Mar 29 '15 at 04:06
  • Yes, @natsya, the post-hoc test has lower power due to the smaller sample size. With less data there is more uncertainty about the true value of the mean of D, so it's less significant. – gung - Reinstate Monica Mar 29 '15 at 04:17
  • BMI data is usually highly right skewed. With so few data points in each group, you might want to consider a non-parametric test since your normality assumptions are likely not met. – StatsStudent Mar 29 '15 at 06:49
  • @StatsStudent, from the description, I think BMI is the *predictor* (even though the OP referred to the response as "X", which is unusual). – gung - Reinstate Monica Mar 29 '15 at 13:18
  • @gung, thanks. I missed that in the OP. Of course, you are right. – StatsStudent Mar 29 '15 at 20:14
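
The point in the comments about the small size of group D can be sketched with a little arithmetic. With a pooled standard deviation s, the standard error of a difference between two group means is s·sqrt(1/n_i + 1/n_j), so the comparison depends on both group sizes. The snippet below uses only the sizes and means from the question; the helper name `se_factor` is hypothetical, and s is left out because the question does not report it (it would scale every ratio equally).

```python
import math

# Group sizes and means are taken from the question.
sizes = {"A": 20, "B": 90, "C": 30, "D": 8}
means = {"A": 16.2, "B": 12.8, "C": 10.8, "D": 11.2}


def se_factor(n_i: int, n_j: int) -> float:
    """Return sqrt(1/n_i + 1/n_j), the SE of a mean difference per unit of pooled SD."""
    return math.sqrt(1.0 / n_i + 1.0 / n_j)


# Mean difference divided by the SE factor, i.e. a t-like ratio multiplied by s.
ratios = {}
for other in ("B", "C", "D"):
    factor = se_factor(sizes["A"], sizes[other])
    diff = means["A"] - means[other]
    ratios[other] = diff / factor
    print(f"A-{other}: diff = {diff:.1f}, SE factor = {factor:.3f}, diff / factor = {ratios[other]:.2f}")

# For reference: the factor if all 148 observations were split evenly (37 per group).
balanced_factor = se_factor(37, 37)
print(f"balanced design: SE factor = {balanced_factor:.3f}")
```

Although the A-D mean difference (5.0) is nearly as large as A-C (5.4), the SE factor for A-D is the largest of the three because D has only 8 observations, so its ratio comes out smallest. That is consistent with A-D being the comparison that fails to reach significance after adjustment.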