Beyond a Welch's t-test

Question

I have two groups of species: the groups have different numbers of species.

For each species, a different number of individuals have been screened for a gene.

I want to describe the difference for the gene between the two groups.

I did a Welch's t-test, as this deals with the different numbers of species within the groups, but it is the different number of individuals of each species that concerns me.

Any tips or ideas on what I can do to overcome this pooling issue?

What are the raw data? Presence or absence of gene in an individual in a species? — Nick Cox, Sep 19 '14 at 11:52

score 2 · Answer 1 · answered Sep 19 '14 at 13:03

It seems that you want a model that better captures the stratification in your data, so you might want a multi-level model. This would capture that the gene varies not only between the two groups but also among the species within each group. An example of such a model, expressed with the glmer function from R's lme4 package (lmer itself only fits linear mixed models), would be:

glmer(presence ~ group + (1 | species), family = binomial)

In this case you're looking for a fixed effect of group while recognizing that there is an additional random effect of species.
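To make the model above concrete, here is a minimal sketch on simulated data. Everything in it is an assumption made for illustration: the data frame, the column names (species, group, presence), the numbers of individuals screened, and the simulated effect sizes. It only shows how one row per individual, with species as a random intercept, lets species with many and few screened individuals enter the same fit.

```r
library(lme4)

set.seed(42)

# Hypothetical design: 12 species in two groups (5 and 7 species),
# each with a different number of screened individuals.
n_screened <- c(8, 40, 15, 3, 22, 10, 60, 5, 18, 30, 7, 12)
species_info <- data.frame(
  species = factor(paste0("sp", seq_along(n_screened))),
  group   = factor(rep(c("A", "B"), times = c(5, 7)))
)

# Simulated truth (assumed values): a group effect on the log-odds scale
# plus a random species-level deviation.
species_info$p <- plogis(-0.5 + 1.0 * (species_info$group == "B") +
                         rnorm(nrow(species_info), sd = 0.5))

# Expand to one row per screened individual.
rows <- rep(seq_len(nrow(species_info)), times = n_screened)
dat <- species_info[rows, ]
dat$presence <- rbinom(nrow(dat), size = 1, prob = dat$p)

# Fixed effect of group, random intercept for species.
fit <- glmer(presence ~ group + (1 | species), data = dat, family = binomial)
print(fixef(fit))  # intercept and group effect, on the log-odds scale
```

The coefficient for group is the estimated difference in log-odds of carrying the gene between the two groups, after allowing each species its own baseline.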

(BTW, you were testing a proportion with a t-test, which is a no-no for several reasons: the variances are known to be unequal, the data may not be normally distributed, and there are better methods that handle binomial data correctly, such as logistic regression, binomial tests, and binomial confidence intervals.)
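As a small illustration of why the number of individuals per species matters, the sketch below computes exact binomial confidence intervals with base R's binom.test for two hypothetical species that show the same observed proportion (60%) but were screened at very different depths. The counts are made up for the example.

```r
# Hypothetical counts: same observed proportion, different numbers screened.
ci_small <- binom.test(x = 3,  n = 5)$conf.int    # 3 of 5 individuals carry the gene
ci_large <- binom.test(x = 60, n = 100)$conf.int  # 60 of 100 individuals carry the gene

# Width of each 95% interval.
width_small <- diff(ci_small)
width_large <- diff(ci_large)

print(ci_small)
print(ci_large)
```

The interval for the species with 5 individuals is much wider than the one for the species with 100, which is the information lost when each species is reduced to a single proportion and fed to a t-test.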

score 0 · Answer 2 · edited Apr 13 '17 at 12:44

I recommend this page, which has already discussed the issue and gives a clear explanation: How should one interpret the comparison of means from different sample sizes?. The one thing I would check is that differences between species within the same group do not have an effect in your study. Hope this helps.

answered Sep 19 '14 at 12:26 by RDGuida (edited Apr 13 '17 at 12:44 by Community)
