Two-sample comparison of proportions, sample size estimation: R vs Stata

I got different results for sample sizes, as follows:

In R

power.prop.test(p1 = 0.70, p2 = 0.85, power = 0.90, sig.level = 0.05)

Result: $n = 160.7777$ (so 161) for each group.

In Stata

sampsi 0.70 0.85, power(0.90) alpha(0.05)

Result: $n = 174$ for each group.

Why the difference? Thanks.

BTW, I ran the same sample size calculation in SAS JMP and got $n = 160$ (almost the same as the R result).

dwstu

1 Answer

The difference arises because Stata's sampsi command (deprecated as of Stata 13 and replaced by power) applies a continuity correction by default, whereas R's power.prop.test() does not (for details on the formula used by Stata, see [PSS] power twoproportions). The correction can be turned off with the nocontinuity option, e.g.,

sampsi 0.70 0.85, power(0.90) alpha(0.05) nocontinuity

which yields a sample size of 161 per group. Using the continuity correction gives a more conservative design (i.e., a larger required sample size), and the correction matters less as the sample size increases.
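
For reference, here are the two formulas as I understand them, for equal group sizes and a two-sided test. The notation is mine, and [PSS] power twoproportions remains the authoritative source for what Stata actually computes. With $\bar p = (p_1 + p_2)/2$, $\bar q = 1 - \bar p$ and $q_i = 1 - p_i$, the uncorrected per-group sample size is

$$n = \frac{\left(z_{1-\alpha/2}\sqrt{2\bar p\,\bar q} + z_{1-\beta}\sqrt{p_1 q_1 + p_2 q_2}\right)^2}{(p_1 - p_2)^2}$$

and the continuity-corrected size is

$$n_{cc} = \frac{n}{4}\left(1 + \sqrt{1 + \frac{4}{n\,|p_1 - p_2|}}\right)^2$$

For $p_1 = 0.70$, $p_2 = 0.85$, $\alpha = 0.05$ and power $0.90$, these give $n \approx 160.78$ and $n_{cc} \approx 173.86$, which round up to the 161 and 174 reported in the question.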

Frank Harrell, in the documentation for bpower (part of his Hmisc package), points out that the formula without the continuity correction is pretty accurate, thereby providing some justification for forgoing the correction.
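
If you want to check both numbers by hand, here is a minimal base-R sketch. It assumes equal group sizes, a two-sided test, the usual normal-approximation formula, and the standard continuity-correction inflation factor. The function name `n_two_prop` and its `continuity` argument are made up for this illustration; they are not part of R, Hmisc, or Stata.

```r
# Per-group sample size for comparing two proportions
# (two-sided test, equal allocation). Hypothetical helper for illustration.
n_two_prop <- function(p1, p2, power = 0.90, sig.level = 0.05,
                       continuity = FALSE) {
  z_alpha <- qnorm(1 - sig.level / 2)  # critical value for the two-sided test
  z_beta  <- qnorm(power)              # quantile matching the target power
  p_bar   <- (p1 + p2) / 2             # pooled proportion under the null
  delta   <- abs(p1 - p2)

  # Uncorrected normal-approximation sample size
  n <- (z_alpha * sqrt(2 * p_bar * (1 - p_bar)) +
        z_beta  * sqrt(p1 * (1 - p1) + p2 * (1 - p2)))^2 / delta^2

  # Optional continuity correction: inflate the uncorrected n
  if (continuity) {
    n <- (n / 4) * (1 + sqrt(1 + 4 / (n * delta)))^2
  }
  n
}

n_plain <- n_two_prop(0.70, 0.85)                     # no correction
n_cc    <- n_two_prop(0.70, 0.85, continuity = TRUE)  # with correction

# Round up to whole subjects per group
print(ceiling(c(uncorrected = n_plain, corrected = n_cc)))
```

For the values in the question, the rounded results should be 161 without the correction and 174 with it.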

Phil Schumm
  • Great answer. It seems the discrepancy is caused not by the difference between the two methods in my post, but by the fact that one of them uses a continuity correction and the other does not. – Michael M Dec 31 '13 at 17:51
  • Thanks. With only two proportions (i.e., a 2x2 table), it doesn't matter whether you specify the alternative as two proportions or as one proportion and an odds ratio. And since Fisher's Exact Test is conservative for the two-sample binomial problem, power estimates based on it are closer to those from the continuity-corrected formula. – Phil Schumm Dec 31 '13 at 22:19
  • Thanks @pschumm. I tried the [Hmisc](http://www.rdocumentation.org/packages/Hmisc/functions/bpower) package's `bsamsize(0.70, 0.85, alpha=0.05, power=0.90)` and got $n_1=n_2=160.7777$. – dwstu Jan 01 '14 at 04:25
  • Does anyone know of a package that uses the same formulas as R's Hmisc package? – Amonet Sep 28 '20 at 16:05