Comment: Illustration of @Glen-b's comment about the equivalence of prop.test and chisq.test for your data.
The two procedures give exactly the same P-value:
prop.test(c(16,5), c(111,88))
2-sample test for equality of proportions with continuity correction
data: c(16, 5) out of c(111, 88)
X-squared = 3.0944, df = 1, p-value = 0.07856
alternative hypothesis: two.sided
95 percent confidence interval:
-0.004154855 0.178806780
sample estimates:
prop 1 prop 2
0.14414414 0.05681818
TBL = rbind(c(16,5),c(95,83))
chisq.test(TBL)
Pearson's Chi-squared test with Yates' continuity correction
data: TBL
X-squared = 3.0944, df = 1, p-value = 0.07856
They are also the same if the continuity correction is not used.
prop.test(c(16,5), c(111,88), correct=FALSE)$p.value
[1] 0.04643965
chisq.test(TBL, correct=FALSE)$p.value
[1] 0.04643965
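As a check on what the two functions are doing, the Pearson statistic can be computed by hand, with and without the continuity correction. This is only a sketch: the names E, dev, x2_plain and x2_yates are my own, and it should reproduce the X-squared = 3.0944 and the two P-values shown above.

```r
# Observed 2x2 table from the example above.
TBL <- rbind(c(16, 5), c(95, 83))

# Expected counts under independence: (row total * column total) / grand total.
E <- outer(rowSums(TBL), colSums(TBL)) / sum(TBL)

# Absolute deviations; in a 2x2 table all four are equal.
dev <- abs(TBL - E)

# Pearson statistic without the correction.
x2_plain <- sum(dev^2 / E)

# Yates' correction shrinks each deviation by 0.5 (never past zero).
x2_yates <- sum((dev - pmin(0.5, dev))^2 / E)

# Upper-tail chi-squared probabilities with 1 degree of freedom.
p_plain <- pchisq(x2_plain, df = 1, lower.tail = FALSE)
p_yates <- pchisq(x2_yates, df = 1, lower.tail = FALSE)

c(x2_yates = x2_yates, p_yates = p_yates, p_plain = p_plain)
```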
However, there are several versions of the test for equality of two binomial proportions. Some versions argue that, because $H_0$ says the two proportions are equal, the standard error of $\hat p_1 - \hat p_2$ should be based on the pooled sample proportion $\hat p = \frac{x_1+x_2}{n_1+n_2}$; others use the separate estimates $\hat p_i = x_i/n_i$ for this purpose. In addition, different computer programs use different kinds of continuity corrections.
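To make the pooled versus separate distinction concrete, here is a minimal sketch of both z tests for your data, with no continuity correction. The variable names are my own. The pooled version should agree with the uncorrected prop.test and chisq.test P-value of 0.04643965, because its squared z statistic is the Pearson chi-squared statistic; the separate-estimates version gives a somewhat different answer.

```r
x <- c(16, 5)      # successes in each group
n <- c(111, 88)    # group sizes

p_hat <- x / n                 # separate estimates x_i / n_i
d     <- p_hat[1] - p_hat[2]   # observed difference in proportions

# Pooled estimate of the common proportion under H0.
p_pool  <- sum(x) / sum(n)
se_pool <- sqrt(p_pool * (1 - p_pool) * sum(1 / n))

# Standard error from the separate estimates.
se_sep <- sqrt(sum(p_hat * (1 - p_hat) / n))

z_pool <- d / se_pool
z_sep  <- d / se_sep

# Two-sided normal P-values.
p_pooled   <- 2 * pnorm(-abs(z_pool))
p_separate <- 2 * pnorm(-abs(z_sep))

c(pooled = p_pooled, separate = p_separate)
```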
Also, if counts are too small for the chi-squared approximation in chisq.test to give an accurate P-value, then R allows the option to simulate a more accurate one (argument simulate.p.value). (Simulation is not supported for prop.test.)
Finally, the Fisher Exact Test can give a different P-value than any of the above.
fisher.test(TBL)$p.value
[1] 0.06220786
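The Fisher P-value conditions on both margins of the table, so it can be sketched directly from the hypergeometric distribution. The code below is my own reconstruction, not the internals of fisher.test: it sums the probabilities of all tables that are no more likely than the observed one, with a small relative tolerance to guard against floating-point ties. It should agree with the value just shown.

```r
TBL <- rbind(c(16, 5), c(95, 83))

m     <- sum(TBL[1, ])   # first row total
n     <- sum(TBL[2, ])   # second row total
k     <- sum(TBL[, 1])   # first column total
x_obs <- TBL[1, 1]       # observed count in the top-left cell

# All values the top-left cell can take with the margins held fixed.
support <- max(0, k - n):min(k, m)

# Hypergeometric probability of each possible table.
probs <- dhyper(support, m, n, k)
p_obs <- dhyper(x_obs, m, n, k)

# Two-sided P-value: total probability of tables at most as likely as the observed one.
p_fisher <- sum(probs[probs <= p_obs * (1 + 1e-7)])
p_fisher
```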
Simulated p-values in chisq.test tend to be close to the Fisher p-value, especially if you use more than the default 2000 iterations to simulate.
Table with small counts:
TAB = rbind(c(40, 3), c(60, 7)); TAB
[,1] [,2]
[1,] 40 3
[2,] 60 7
chisq.test(TAB)
Pearson's Chi-squared test with Yates' continuity correction
data: TAB
X-squared = 0.077317, df = 1, p-value = 0.781
Warning message:
In chisq.test(TAB) :
Chi-squared approximation may be incorrect
chisq.test(TAB, simulate.p.value=TRUE)
Pearson's Chi-squared test
with simulated p-value
(based on 2000 replicates)
data: TAB
X-squared = 0.38181, df = NA, p-value = 0.7276
More iterations:
chisq.test(TAB, simulate.p.value=TRUE, B = 7000)
Pearson's Chi-squared test
with simulated p-value
(based on 7000 replicates)
data: TAB
X-squared = 0.38181, df = NA, p-value = 0.744
fisher.test(TAB)
Fisher's Exact Test for Count Data
data: TAB
p-value = 0.7374
alternative hypothesis: true odds ratio is not equal to 1
95 percent confidence interval:
0.3292961 9.8349592
sample estimates:
odds ratio
1.549608
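For intuition about the simulated p-value for TAB, here is a rough sketch of the idea as I understand it: generate random tables with the same row and column totals as TAB (using r2dtable), compute the uncorrected Pearson statistic for each, and count how often it is at least as large as the observed one. The helper pearson, the seed, and the tolerance factor are my own choices, so the result will not match chisq.test digit for digit, but it should land near the simulated and Fisher p-values above.

```r
set.seed(1)  # arbitrary seed, only so this sketch is reproducible

TAB <- rbind(c(40, 3), c(60, 7))
B   <- 2000  # number of simulated tables (same as the chisq.test default)

# Pearson statistic without continuity correction (hypothetical helper).
pearson <- function(tab) {
  E <- outer(rowSums(tab), colSums(tab)) / sum(tab)
  sum((tab - E)^2 / E)
}

stat_obs <- pearson(TAB)

# Random 2x2 tables with the same margins as TAB.
sims     <- r2dtable(B, rowSums(TAB), colSums(TAB))
stat_sim <- vapply(sims, pearson, numeric(1))

# Proportion of tables at least as extreme as the observed one,
# counting the observed table itself; the factor guards against rounding ties.
p_sim <- (1 + sum(stat_sim >= stat_obs * (1 - 1e-12))) / (B + 1)

c(statistic = stat_obs, p_sim = p_sim)
```

Because both margins are held fixed, this simulation targets the same conditional distribution that the Fisher test uses, which is why the two kinds of p-value tend to be close.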