Is the z-test for difference of proportions valid for massive samples with tiny proportions?

Question

Let's say I want to run a difference of proportions test where each side has n=23,000 but their proportions are 0.21% and 0.34%.

     group1  group2
n     23000   23000
x        50      78
prop  0.21%   0.34%

In both groups, np ≥ 50 and n(1-p) ≥ 50.

A standard z-score test will say this difference is significant.
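
As a rough check of that claim, here is a minimal sketch in R of the calculation on the counts from the table. It assumes that "standard z-score test" means the pooled two-proportion z-test; the variable names are illustrative only.

```r
# Counts from the table above
n1 <- 23000; x1 <- 50
n2 <- 23000; x2 <- 78

p1 <- x1 / n1
p2 <- x2 / n2

# Pooled proportion under the null hypothesis of equal proportions
p_pool <- (x1 + x2) / (n1 + n2)

# Standard error of the difference under the null hypothesis
se <- sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))

z <- (p2 - p1) / se
p_value <- 2 * pnorm(-abs(z))   # two-sided p-value

c(z = z, p_value = p_value)
```

With these counts z comes out near 2.5, so the two-sided p-value falls below .05. `prop.test(c(x1, x2), c(n1, n2), correct = FALSE)` should report the same p-value, since its chi-square statistic is the square of z.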

However, my intuition tells me the test should not work for such small proportions. If the true proportions were equal, and with such a rare event, I would actually expect to see large differences like this just from sampling variability. Am I right in thinking this? Does the difference of proportions test break down for tiny proportions?

Note: This is a purely hypothetical question. In real life, I don't care that group2 outperformed group1. The event rate is so low that there is little value in using it. In other words, it is statistically significant but not clinically significant.

Please check your question for slips; it is a bit unclear. You say each group is 26000 but then print 23000. The values 0.33 and 0.39 also change later on. Are these values percents or proportions, after all? 0.33 percent isn't tiny at all, is it? — ttnphns, Mar 06 '13 at 19:05
@ttnphns Sorry, typo on my part; I copied the wrong line from Excel. The numbers are quoted as percentages: 50/23000 = 0.21% = 0.0021. — dan, Mar 06 '13 at 19:43
You are right in your caution. The z-test for proportions is in effect a 2x2 table chi-square test, which, as you might know, is erroneous in case of very low (or high) proportions. Fortunately, there exist _exact tests_ for such a case, for example Fisher's exact test for a 2x2 table. — ttnphns, Mar 06 '13 at 19:56
Your *intuition* should focus on counts rather than proportions. The counts themselves will have Poisson distributions with expectations around $0.21/100 \times 23000 = 48.3$. Yes, those distributions are a little skewed, but not badly: a Z-test won't be too far wrong. — whuber, Mar 06 '13 at 21:53
@whuber I'm not sure I follow, could you elaborate or point me in the right direction? — dan, Mar 07 '13 at 02:38
It sounds like you might want to learn more about Poisson distributions. Searching this site will turn up a lot :-). — whuber, Mar 07 '13 at 08:16
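
To make the count-based view from these comments concrete, here is a minimal sketch in R. It assumes the two counts are independent and approximately Poisson with a common rate under the null hypothesis. Because the group sizes are equal, the group 1 count is then Binomial(total, 1/2) conditional on the total. The variable names are illustrative.

```r
x1 <- 50; x2 <- 78          # observed counts from the question
total <- x1 + x2

# For independent Poisson counts, Var(x2 - x1) = E(x1) + E(x2),
# which is estimated here by the observed total.
z_pois <- (x2 - x1) / sqrt(total)
p_pois <- 2 * pnorm(-abs(z_pois))

# Exact conditional test: given the total, x1 ~ Binomial(total, 0.5)
# when both groups have the same size and the same event rate.
p_exact <- binom.test(x1, total, p = 0.5)$p.value

c(z = z_pois, p_normal = p_pois, p_exact = p_exact)
```

The Poisson z statistic is 28 / sqrt(128), about 2.47, which is nearly identical to the pooled z statistic because 1 - p is almost 1 for such rare events.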

score 7 · Answer 1 · answered Mar 06 '13 at 21:41

Whenever I have doubts about the performance of a particular method, I run a simulation study to examine how well the method works under similar conditions. Below is a simple example in R for the case you describe. Note that I set the true proportions equal for the two groups, at a value between the two proportions you observed. The simulation therefore provides the empirical Type I error rate of the test, which should be close to .05 if the test works as intended. Setting the number of iterations large enough ensures that the simulation error is small. I also run the test once without and once with Yates' continuity correction to see whether the correction matters here.

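iters <- 100000   # number of simulated datasets

n <- 23000        # sample size per group
p <- 0.0027       # common true proportion under the null hypothesis

# simulate the event counts for both groups
x1i <- rbinom(iters, n, p)
x2i <- rbinom(iters, n, p)

pval1 <- rep(NA, iters)   # p-values without continuity correction
pval2 <- rep(NA, iters)   # p-values with Yates' continuity correction

for (i in 1:iters) {
   # 2x2 table: rows are groups, columns are events and non-events
   tab <- matrix(c(x1i[i], n-x1i[i], x2i[i], n-x2i[i]), nrow=2, byrow=TRUE)
   pval1[i] <- chisq.test(tab, correct=FALSE)$p.value
   pval2[i] <- chisq.test(tab, correct=TRUE)$p.value
}

# empirical Type I error rates at alpha = .05
round(mean(pval1 <= .05), 3)
round(mean(pval2 <= .05), 3)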

Here are the results from one run:

> round(mean(pval1 <= .05), 3)
[1] 0.05
> round(mean(pval2 <= .05), 3)
[1] 0.04

So the empirical Type I error rate matches the nominal .05 when Yates' continuity correction is not used. With the correction, the test is slightly conservative.
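
For comparison, the same two versions of the test can be applied to the observed counts from the question, alongside Fisher's exact test mentioned in the comments. This is only a sketch; the object names are illustrative.

```r
# 2x2 table of the observed counts from the question (events, non-events)
tab <- matrix(c(50, 23000 - 50,
                78, 23000 - 78), nrow = 2, byrow = TRUE)

p_uncorrected <- chisq.test(tab, correct = FALSE)$p.value
p_yates       <- chisq.test(tab, correct = TRUE)$p.value
p_fisher      <- fisher.test(tab)$p.value

round(c(uncorrected = p_uncorrected, yates = p_yates, fisher = p_fisher), 4)
```

All three p-values fall below .05 for these counts, and the Yates-corrected p-value is larger than the uncorrected one, in line with the correction making the test more conservative.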

If you want to find out about the power of the test, you can set the true proportions to two different values and then rerun the simulation.
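
A minimal sketch of such a power simulation follows. It assumes, purely for illustration, that the true proportions equal the observed ones (50/23000 and 78/23000). The seed, the reduced number of iterations, and the vectorized pooled z-test used in place of the `chisq.test` loop are choices made for this sketch, not part of the simulation above. The pooled z-test gives the same p-values as `chisq.test` with `correct=FALSE`.

```r
set.seed(1)          # arbitrary seed, for reproducibility only
iters <- 10000       # fewer iterations than above, to keep it quick

n  <- 23000
p1 <- 50 / 23000     # assumed true proportion in group 1
p2 <- 78 / 23000     # assumed true proportion in group 2

x1 <- rbinom(iters, n, p1)
x2 <- rbinom(iters, n, p2)

# Pooled two-proportion z-test, computed for all iterations at once
p_pool <- (x1 + x2) / (2 * n)
se     <- sqrt(p_pool * (1 - p_pool) * 2 / n)
z      <- (x2 / n - x1 / n) / se
pval   <- 2 * pnorm(-abs(z))

# Proportion of simulated datasets in which the null is rejected
power_hat <- mean(pval <= .05)
round(power_hat, 3)
```

`power.prop.test(n = 23000, p1 = 50/23000, p2 = 78/23000)` offers an analytic approximation to compare the simulated value against.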
