Let's say I want to run a difference of proportions test where each side has n=23,000 but their proportions are 0.21% and 0.34%.
        group1    group2
n       23000     23000
x       50        78
prop    0.21%     0.34%
In both groups, n*p and n*(1-p) are at least 50, so the usual sample-size conditions for the normal approximation are met.
A standard two-proportion z-test will say this difference is significant.
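For concreteness, here is a minimal sketch of the z-test I have in mind, run on the counts from the table. It assumes the pooled-variance form of the test and a two-sided alternative; the function name `two_proportion_z_test` is just an illustrative helper, not a library call.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(x1, n1, x2, n2):
    """Pooled two-proportion z-test; returns (z, two-sided p-value)."""
    p1, p2 = x1 / n1, x2 / n2
    # Under H0 both groups share one proportion, estimated from the pooled counts.
    pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p2 - p1) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


z, p_value = two_proportion_z_test(50, 23000, 78, 23000)
print(f"z = {z:.2f}, two-sided p = {p_value:.4f}")
```

With these counts the statistic comes out at roughly z = 2.48, which is past the usual 1.96 cutoff for a two-sided test at the 5% level.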
However, my intuition tells me the test should not work for such small proportions. With such a rare event, I would expect to see differences this large from sampling variability alone, even if the true proportions were equal. Am I right in thinking this? Does the difference of proportions test break down for tiny proportions?
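One way to check this intuition without relying on the normal approximation is an exact conditional test. The sketch below swaps in Fisher's exact test (not the z-test itself): it conditions on the 128 total events and sums the hypergeometric probabilities of all splits that are no more likely than the observed one. The helper name `fisher_exact_two_sided` and the small floating-point tolerance are my own assumptions for illustration.

```python
from math import comb


def fisher_exact_two_sided(x1, n1, x2, n2):
    """Two-sided Fisher exact p-value for a 2x2 table of event counts."""
    total = x1 + x2
    denom = comb(n1 + n2, total)

    def prob(k):
        # P(group 2 has k of the `total` events) under H0 (hypergeometric).
        return comb(n2, k) * comb(n1, total - k) / denom

    k_min = max(0, total - n1)
    k_max = min(n2, total)
    p_obs = prob(x2)
    # Sum every outcome that is at most as likely as the observed split;
    # the tolerance guards against floating-point ties.
    p_value = sum(
        p for p in (prob(k) for k in range(k_min, k_max + 1))
        if p <= p_obs * (1 + 1e-9)
    )
    return min(1.0, p_value)


p_exact = fisher_exact_two_sided(50, 23000, 78, 23000)
print(f"Fisher exact two-sided p = {p_exact:.4f}")
```

If the exact p-value lands close to the z-test's p-value, that would suggest the normal approximation is not the weak point for counts of this size.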
Note: This is a purely hypothetical question. In real life, I don't care that group2 outperformed group1. The event rate is so low that there is little value in using it. In other words, it is statistically significant but not clinically significant.