I measured a binary variable in two different populations, and now I'm trying to find out whether the populations differ with regard to this variable. I could use a chi-square test, but that would require both populations to have the same length. Is there an appropriate test for these circumstances? Thank you.
Are you sure the chi-square test requires the same size? (I do not clearly remember the chi-square test.) But anyway, you can use Fisher's exact test. – Stéphane Laurent Mar 26 '12 at 18:49
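A minimal sketch of the Fisher's exact test suggestion in R. The counts are illustrative only (they reuse the numbers from the smoking example in the first answer), and the group and outcome labels are placeholders. Note that the two groups have different sizes, which `fisher.test()` handles without any adjustment.

```r
# Illustrative 2x2 table of counts: rows are groups, columns are outcomes.
# Group sizes differ on purpose (190 vs. 205).
tab <- matrix(c(70, 120,
                65, 140),
              ncol = 2, byrow = TRUE,
              dimnames = list(group = c("A", "B"),
                              outcome = c("yes", "no")))

ft <- fisher.test(tab)  # exact test of independence in a 2x2 table
ft$p.value              # two-sided p-value
ft$estimate             # conditional maximum likelihood estimate of the odds ratio
ft$conf.int             # confidence interval for the odds ratio
```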
If you have a decent sample size you can use the normal approximation to the binomial: http://en.wikipedia.org/wiki/Binomial_distribution#Normal_approximation and use a $z$-test – Macro Apr 05 '12 at 13:47
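The $z$-test mentioned in this comment could be sketched by hand as follows. This assumes the usual pooled-proportion form of the test; the counts are illustrative (borrowed from the smoking example in the first answer) and the variable names are made up for the example.

```r
# Illustrative counts: successes and group sizes (unequal on purpose)
x <- c(70, 65)    # number of "successes" in each group
n <- c(190, 205)  # group sizes

p_hat  <- x / n            # sample proportions
p_pool <- sum(x) / sum(n)  # pooled proportion under H0: p1 = p2

# Standard error of the difference in proportions under H0
se <- sqrt(p_pool * (1 - p_pool) * sum(1 / n))

z       <- (p_hat[1] - p_hat[2]) / se
p_value <- 2 * pnorm(-abs(z))  # two-sided p-value

# For comparison: z^2 is the uncorrected chi-square statistic from prop.test()
chi2 <- unname(prop.test(x, n, correct = FALSE)$statistic)
```

Squaring `z` should reproduce `chi2`, which is one way to see that the $z$-test and the chi-square test without continuity correction are the same test for two groups.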
4 Answers
The chi-square test doesn't require equal-size groups. In R you can use either prop.test() or chisq.test().
I do this often with A/B direct mail tests with unequal size groups. For example, 100K donors are split 90% and 10%: the 90% are sent an email appeal, and 10% are sent nothing. The binary outcome is whether they donated to the appeal.
The nice thing about prop.test() over chisq.test() is that prop.test() reports both the p-value for the null hypothesis that the two proportions are equal and a confidence interval for the difference between them.
This page gives an example of prop.test() with two groups: http://cran.r-project.org/doc/contrib/Lemon-kickstart/kr_prop.html
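# Rows are groups; columns are outcome counts (successes first)
sexsmoke <- matrix(c(70, 120, 65, 140), ncol = 2, byrow = TRUE)
rownames(sexsmoke) <- c("male", "female")
colnames(sexsmoke) <- c("smoke", "nosmoke")
# Tests whether the smoking proportion is the same in both rows
prop.test(sexsmoke)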
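To make the comparison between the two functions concrete, here is a small sketch on the same table. It relies on both functions applying a continuity correction by default for a 2x2 table, so their p-values should agree; only `prop.test()` returns a confidence interval for the difference in proportions.

```r
# Same 2x2 table as above, with unequal group sizes (190 males, 205 females)
sexsmoke <- matrix(c(70, 120, 65, 140), ncol = 2, byrow = TRUE,
                   dimnames = list(c("male", "female"),
                                   c("smoke", "nosmoke")))

pt <- prop.test(sexsmoke)   # test of equal proportions
ct <- chisq.test(sexsmoke)  # chi-square test of independence

pt$p.value   # p-value from prop.test()
ct$p.value   # p-value from chisq.test()
pt$estimate  # estimated smoking proportion in each group
pt$conf.int  # confidence interval for the difference in proportions
```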

You can do a two-sample t-test, perhaps after transforming the proportions using, e.g., the arcsine transformation.
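One possible reading of this suggestion, sketched under stated assumptions: with a single proportion per group, the transformed value $2\arcsin\sqrt{\hat p}$ has approximate variance $1/n$ whatever the true proportion is, so the difference can be compared against a normal reference distribution. This swaps the t reference for a normal one and is not necessarily what the answer's author had in mind. The counts are illustrative, taken from the smoking example in the first answer.

```r
# Illustrative counts with unequal group sizes
x <- c(70, 65)    # successes per group
n <- c(190, 205)  # group sizes

p_hat <- x / n

# Arcsine square-root transform; the approximate variance of each
# transformed proportion is 1 / n, independent of the proportion itself
phi <- 2 * asin(sqrt(p_hat))

z_asin <- (phi[1] - phi[2]) / sqrt(sum(1 / n))
p_asin <- 2 * pnorm(-abs(z_asin))  # two-sided p-value
```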

... "perhaps after transforming" because the $\arcsin$ transformation is a variance-stabilizing transform that enhances precision, as the variance depends on the only parameter of the Bernoulli distribution. – Horst Grünbusch May 23 '15 at 10:41
You can actually use logistic regression (a GLM) with the outcome as the dependent variable and group membership as an explanatory factor variable.
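A minimal sketch of the logistic regression approach, assuming the data are available as counts per group. The data frame and its column names are hypothetical, and the counts are the illustrative ones from the smoking example in the first answer.

```r
# Aggregated data: one row per group, with success and failure counts
d <- data.frame(group   = factor(c("male", "female")),
                smoke   = c(70, 65),
                nosmoke = c(120, 140))

# Binomial GLM with a two-column (successes, failures) response;
# unequal group sizes are handled through the counts themselves
fit <- glm(cbind(smoke, nosmoke) ~ group, family = binomial, data = d)

# The coefficient for the group term is the log odds ratio between groups;
# its Wald test addresses whether the groups differ
summary(fit)$coefficients

# Likelihood ratio test for the group effect
anova(fit, test = "Chisq")
```

With individual-level data, the same model can be fitted with a 0/1 outcome column as the response instead of the two-column count form.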

Weighting by sample size is built into how the expected values are computed. The only thing to worry about is how small an expected value is allowed to be.
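A short sketch of how the expected values come out of the row and column totals, which is where the group sizes enter. The table is illustrative (the smoking example from the first answer), and the threshold of 5 is only the commonly quoted rule of thumb, used here as an assumption.

```r
# Illustrative 2x2 table with unequal row totals (190 and 205)
tab <- matrix(c(70, 120, 65, 140), ncol = 2, byrow = TRUE,
              dimnames = list(c("male", "female"),
                              c("smoke", "nosmoke")))

ct <- chisq.test(tab)

# Expected counts as computed by chisq.test()
ct$expected

# The same values by hand: (row total * column total) / grand total
expected_manual <- outer(rowSums(tab), colSums(tab)) / sum(tab)

# Rule-of-thumb check (assumed threshold: every expected count at least 5)
all(ct$expected >= 5)
```

If some expected counts turn out to be small, Fisher's exact test, mentioned in the comments on the question, is a common fallback.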

Welcome to the site, @JamalMunshi. This is a little more of a comment than an answer; would you mind expanding it a bit? – gung - Reinstate Monica Nov 14 '14 at 09:34
Maybe I did not understand the question very well, as I am unsure what is meant by the length of a population. – Jamal Munshi Nov 14 '14 at 12:17