Comparing two binary variables of unequal sizes

Question

I measured a binary variable in two different populations, and now I'm trying to find out whether the populations differ with regard to this variable. I could use a chi-square test, but I believe that would require both populations to have the same length. Is there an appropriate test for these circumstances? Thank you.

Are you sure the Chi² requires the same size ? (I do not clearly remember the Chi²). But anyway, you can use Fisher's exact test. — Stéphane Laurent, Mar 26 '12 at 18:49
If you have a decent sample size you can use the normal approximation to the binomial: http://en.wikipedia.org/wiki/Binomial_distribution#Normal_approximation and use a $z$-test — Macro, Apr 05 '12 at 13:47
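
For illustration, here is a minimal R sketch of the two suggestions in the comments above: Fisher's exact test and a pooled two-sample z-test for proportions. The counts in `x` and `n` are made up for the example and do not come from the question.

```r
# Hypothetical data: successes and sample sizes for two groups of unequal size
x <- c(45, 30)     # number of "successes" in each group
n <- c(200, 80)    # group sizes (deliberately unequal)

# Fisher's exact test on the 2x2 table of successes and failures
tab <- cbind(success = x, failure = n - x)
fisher_p <- fisher.test(tab)$p.value

# Two-sample z-test using the normal approximation with a pooled proportion
p_hat  <- x / n
p_pool <- sum(x) / sum(n)
se     <- sqrt(p_pool * (1 - p_pool) * sum(1 / n))
z      <- (p_hat[1] - p_hat[2]) / se
z_p    <- 2 * pnorm(-abs(z))   # two-sided p-value

fisher_p
z_p
```

Neither calculation needs equal group sizes. The squared z statistic equals the chi-square statistic from `prop.test(x, n, correct = FALSE)`, so the z-test and the uncorrected chi-square test are the same test in this two-group case.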

score 7 · Answer 1 · answered May 25 '12 at 17:27

The chi-square test doesn't require groups of equal size. In R you can use either prop.test() or chisq.test().

I do this often with A/B direct mail tests with unequal size groups. For example, 100K donors are split 90% and 10%: the 90% are sent an email appeal, and 10% are sent nothing. The binary outcome is whether they donated to the appeal.

The nice thing about prop.test() compared with chisq.test() is that prop.test() gives both the p-value for the null hypothesis that the group proportions are equal and a confidence interval for the difference between the proportions.

This page gives an example of prop.test() with two groups: http://cran.r-project.org/doc/contrib/Lemon-kickstart/kr_prop.html

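# 2x2 table of counts: rows are groups, columns are outcomes
sexsmoke <- matrix(c(70, 120, 65, 140), ncol = 2, byrow = TRUE)
rownames(sexsmoke) <- c("male", "female")
colnames(sexsmoke) <- c("smoke", "nosmoke")
# Tests equality of the smoking proportions (first column = "successes")
prop.test(sexsmoke)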
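
As a rough sketch of the 90%/10% split described above, the following compares prop.test() and chisq.test() on the same data. The group sizes follow the 100K example, but the donation counts are invented purely for illustration.

```r
# Group sizes from the 90/10 split of 100K donors; donation counts are hypothetical
donated <- c(appeal = 1350, control = 100)
group_n <- c(appeal = 90000, control = 10000)

# prop.test() takes successes and group sizes directly
pt <- prop.test(donated, group_n)

# chisq.test() takes the 2x2 table of successes and failures
ct <- chisq.test(cbind(donated, group_n - donated))

pt$p.value    # p-value for equal proportions
ct$p.value    # same p-value from the chi-square test
pt$conf.int   # confidence interval for the difference in proportions
```

With two groups and the default continuity correction, both functions return the same p-value. Only prop.test() also reports the confidence interval.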

score 1 · Answer 2 · answered Mar 26 '12 at 17:03


You can do a two sample t-test, perhaps after transforming the proportions using e.g. the arcsine transformation.

answered Mar 26 '12 at 17:03 by Peter Flom


... "perhaps after transforming" because the $\arcsin$ transformation is a variance stabilizing transform enhancing precision, as the variance depends on the only parameter of the Bernoulli distribution. – Horst Grünbusch May 23 '15 at 10:41
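
One way to read the suggestion in Answer 2 is sketched below, using made-up counts. It assumes the usual large-sample result that asin(sqrt(p_hat)) has variance of roughly 1/(4n) whatever the true proportion is. With only two proportions this gives a z-type statistic rather than a literal t-test, so a plain Welch t-test on the raw 0/1 observations is shown alongside for comparison.

```r
# Hypothetical data: successes and sample sizes for two groups of unequal size
x <- c(45, 30)
n <- c(200, 80)

# Arcsine square-root transform: variance is approximately 1 / (4 * n)
phi    <- asin(sqrt(x / n))
z_asin <- (phi[1] - phi[2]) / sqrt(sum(1 / (4 * n)))
p_asin <- 2 * pnorm(-abs(z_asin))   # two-sided p-value

# Welch two-sample t-test on the raw 0/1 observations, without transforming
g1  <- rep(c(1, 0), c(x[1], n[1] - x[1]))
g2  <- rep(c(1, 0), c(x[2], n[2] - x[2]))
t_p <- t.test(g1, g2)$p.value

p_asin
t_p
```

Both versions handle unequal group sizes, because each group's sample size enters through its own variance term.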

score 0 · Answer 3 · answered Nov 14 '14 at 07:41


You can actually use logistic regression / glm with the outcome as dependent variable and group belonging as explanatory factor variable.

answered Nov 14 '14 at 07:41 by Alexander Radev
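
A minimal sketch of the logistic regression approach from Answer 3, fitted to aggregated counts. The data frame `dat`, its column names and the counts are all made up for the example.

```r
# Hypothetical aggregated data: one row per group
dat <- data.frame(
  group   = factor(c("A", "B")),
  success = c(45, 30),
  failure = c(155, 50)
)

# Binomial GLM: the outcome is given as (successes, failures), group is the predictor
fit <- glm(cbind(success, failure) ~ group, family = binomial, data = dat)

# Wald z-test for the group effect (coefficient is the log odds ratio, B vs. A)
summary(fit)$coefficients

# Likelihood-ratio test of the group effect
anova(fit, test = "Chisq")
```

The same model can be fitted to one row per individual with a 0/1 outcome. The `groupB` coefficient is the log odds ratio between the groups, and the unequal group sizes are reflected in its standard error.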

score -1 · Answer 4 · edited Nov 14 '14 at 09:35


In the chi-square test, weighting by sample size is built into how the expected values are computed. The only thing to worry about is how small an expected value is allowed to be.

answered Nov 14 '14 at 06:31 by Jamal Munshi; edited Nov 14 '14 at 09:35 by gung - Reinstate Monica
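
To make the point about expected values concrete, the sketch below computes them by hand for a made-up 2x2 table and compares them with what chisq.test() reports. The threshold of 5 is the common rule of thumb, used here as an assumption rather than something stated in the answer.

```r
# Hypothetical 2x2 table with unequal group sizes (200 vs. 80)
tab <- matrix(c(45, 155, 30, 50), nrow = 2, byrow = TRUE,
              dimnames = list(group = c("A", "B"),
                              outcome = c("success", "failure")))

# Expected count in each cell: row total * column total / grand total
expected <- outer(rowSums(tab), colSums(tab)) / sum(tab)

# chisq.test() computes the same expected counts internally
ct <- chisq.test(tab)
all.equal(expected, ct$expected, check.attributes = FALSE)

# Rule-of-thumb check: flag any expected count below 5
any(expected < 5)
```

Because the row totals enter the expected counts directly, the larger group automatically receives proportionally larger expected values. If any expected count is small, fisher.test(tab) is a common alternative.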


Welcome to the site, @JamalMunshi. This is more of a comment than an answer; would you mind expanding it a bit? – gung - Reinstate Monica Nov 14 '14 at 09:34
Maybe I did not understand the question very well, as I am unsure what is meant by the length of a population. – Jamal Munshi Nov 14 '14 at 12:17
