Just for statistically intimidated people (wait, that should be 'people intimidated by statistics') who can identify with this joke:
A patient asks his surgeon what the odds are of him surviving an impending operation. The doctor replies that the odds are usually 50-50. "But there is no need to worry," the doctor explains. "The first fifty have already died."
It is meant as a quick reference for those confused by the terminology and the overlapping tests - I am not addressing confidence intervals.
COMPARING PROPORTIONS BETWEEN SAMPLES:
I will use the following toy tabulated data:
# 2 x 2 contingency table: rows = symptoms (Heartburn, Normal), columns = drugs (A, B)
Antacid <- matrix(c(64, 178 - 64, 92, 190 - 92), nrow = 2)

# A copy of the table with the marginal totals appended, just for display
Antacid_marginals <- matrix(c(64, 178 - 64, 92, 190 - 92), nrow = 2)
Antacid_marginals <- rbind(Antacid_marginals, margin.table(Antacid_marginals, 2))  # column totals as a new row
Antacid_marginals <- cbind(Antacid_marginals, margin.table(Antacid_marginals, 1))  # row totals as a new column
dimnames(Antacid_marginals) <- list(Symptoms   = c("Heartburn", "Normal", "Totals"),
                                    Medication = c("Drug A", "Drug B", "Totals"))
This is what it looks like (with marginals):
            Medication
Symptoms     Drug A  Drug B  Totals
  Heartburn      64      92     156
  Normal        114      98     212
  Totals        178     190     368
So we have 368 patients: 178 on Drug A and 190 on Drug B, and we want to see whether the proportion of heartburn symptoms differs between Drug A and Drug B, i.e. $p_1 = 64/178$ vs $p_2 = 92/190$.
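Just to double-check those two proportions in [R] (a quick sketch, using the Antacid matrix defined above), prop.table() with margin = 2 gives the per-drug (column-wise) proportions; the first row reproduces $p_1 = 64/178 \approx 0.36$ and $p_2 = 92/190 \approx 0.48$:
prop.table(Antacid, margin = 2)
          [,1]      [,2]
[1,] 0.3595506 0.4842105
[2,] 0.6404494 0.5157895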
1. FISHER EXACT TEST: There is a discussion on Wikipedia about "Controversies". Based on the hypergeometric distribution, it is probably most appropriate when the expected count in any cell of the contingency table is below 5 - 10. The story of R. A. Fisher and the lady tasting tea is great, and can be reproduced in [R] by simply grabbing the code here. [R] handles the large numbers in our data without a pause (no problem with the factorials); a by-hand sketch with dhyper() follows the output below:
Antacid <- matrix(c(64, 178 - 64, 92, 190 - 92), nrow = 2)
fisher.test(Antacid, alternative = "two.sided")
Fisher's Exact Test for Count Data
data: Antacid
p-value = 0.02011
alternative hypothesis: true odds ratio is not equal to 1
95 percent confidence interval:
0.3850709 0.9277156
sample estimates:
odds ratio
0.5988478
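Under the hood this is just the hypergeometric distribution at work: the two-sided p-value is (essentially) the sum of the probabilities of all tables that are no more probable than the one observed. A rough by-hand sketch, reusing the counts above (fisher.test() applies a tiny numerical tolerance in the comparison, so the last digit may differ):
probs <- dhyper(0:156, m = 178, n = 190, k = 156)  # P(x of the 156 heartburn cases fall on Drug A)
sum(probs[probs <= dhyper(64, 178, 190, 156)])     # ~ 0.0201, the two-sided p-value above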
2. CHI-SQUARE TEST OF HOMOGENEITY: Otherwise known as Pearson's chi-squared test (the same machinery as the goodness-of-fit test, applied here to a contingency table). For larger samples (expected frequency count > 5 in each cell) the $\chi^2$ provides an approximation of the significance value. The test is based on the expected frequency counts, obtained by cross-multiplying the marginals and dividing by the grand total; and since the standardized deviations of the observed from the expected counts are approximately normal, it makes sense that we end up with a $\chi^2$-distributed test statistic, because if $Z\sim N(0,1)$, then $Z^2\sim \chi^2(1)$. The expected counts (an [R] sketch for computing them follows the table) are:
            Medication
Symptoms     Drug A                     Drug B
  Heartburn  156 * 178 / 368 = 75.46    156 * 190 / 368 = 80.54
  Normal     212 * 178 / 368 = 102.54   212 * 190 / 368 = 109.46
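These expected counts are easy to get in [R] (a sketch, again assuming the Antacid matrix from above): cross-multiply the marginals with outer(), or simply pull them out of the chisq.test() output:
E <- outer(rowSums(Antacid), colSums(Antacid)) / sum(Antacid)  # (row total * column total) / grand total
round(E, 2)                                     # 75.46  80.54 / 102.54 109.46
chisq.test(Antacid, correct = FALSE)$expected   # the same matrix, stored by chisq.test()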
The degrees of freedom are calculated as {number of populations (Drug A and Drug B patients, i.e. 2) minus 1} * {number of levels of the outcome variable (Heartburn and Normal, i.e. 2) minus 1}. Therefore, in a 2x2 table we are dealing with 1 d.f. And crucially, a $\chi^2$ with $1\,df$ is exactly a squared standard normal $N(0,1)$ (proof here), which explains the sentence "a chi-square test for equality of two proportions is exactly the same thing as a z-test. The chi-squared distribution with one degree of freedom is just that of a normal deviate, squared. You're basically just repeating the chi-squared test on a subset of the contingency table" in this post.
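A one-liner to convince yourself of that equivalence (just a numerical check of the quantiles):
qnorm(0.975)^2        # the two-sided 5% cut-off for N(0,1), squared: 3.841459
qchisq(0.95, df = 1)  # the 5% cut-off for a chi-square with 1 df:    3.841459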
The Test Statistic is calculated as:
$\chi^2=\frac{(64-75.46)^2}{75.46} + \frac{(92-80.54)^2}{80.54} +\frac{(114-102.54)^2}{102.54} + \frac{(98-109.46)^2}{109.46} \approx 5.85$
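Using the unrounded expected counts, the same number can be reproduced by hand (a sketch that recomputes E as in the snippet above):
E    <- outer(rowSums(Antacid), colSums(Antacid)) / sum(Antacid)
chi2 <- sum((Antacid - E)^2 / E)              # 5.8481, the X-squared reported below
pchisq(chi2, df = 1, lower.tail = FALSE)      # p-value ~ 0.0156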
In R this is computed with the function prop.test() or chisq.test(), which yield the same result, as indicated here:
prop.test(Antacid, correct = F)
2-sample test for equality of proportions without continuity correction
data: Antacid
X-squared = 5.8481, df = 1, p-value = 0.01559
alternative hypothesis: two.sided
95 percent confidence interval:
-0.22976374 -0.02519514
sample estimates:
prop 1 prop 2
0.4102564 0.5377358
or..
chisq.test(Antacid, correct = F)
Pearson's Chi-squared test
data: Antacid
X-squared = 5.8481, df = 1, p-value = 0.01559
3. G-TEST: Pearson's chi-squared statistic is the second-order Taylor expansion (around 1) of the G-test statistic; hence the two tend to converge (a by-hand check follows the output below). In R:
library(DescTools)
GTest(Antacid, correct = 'none')
Log likelihood ratio (G-test) test of
independence without correction
data: Antacid
G = 5.8703, X-squared df = 1, p-value = 0.0154
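The G statistic itself is just $G = 2\sum O\,\ln(O/E)$, with O the observed and E the expected counts, so the value above can also be reproduced by hand (a sketch reusing the Antacid matrix):
E <- chisq.test(Antacid, correct = FALSE)$expected  # expected counts, as before
G <- 2 * sum(Antacid * log(Antacid / E))            # ~ 5.8703, as in the GTest() output
pchisq(G, df = 1, lower.tail = FALSE)               # ~ 0.0154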
4. Z-TEST OF PROPORTIONS: The normal distribution is a good approximation to the binomial when $np>5$ and $n(1-p)>5$. When successes are rare relative to the total number of observations, the binomial behaves like a Poisson process, and it is the expected number of events that determines whether a normal approximation is reasonable ($\lambda \geq 5$).
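For our data the rule of thumb is comfortably met (a quick check; note that $np$ and $n(1-p)$ are just the cell counts of the table):
n <- c(DrugA = 178, DrugB = 190)
x <- c(64, 92)                         # heartburn ("successes")
cbind(successes = x, failures = n - x) # n*p and n*(1-p): all four values are well above 5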
Although the post hyperlinked is old, I haven't found an R function for it mentioned on CV. This may be due to the fact explained above, namely that $\chi^2_{(df=1)}$ is just the distribution of a squared $N(0,1)$.
The Test Statistic is:
$ \displaystyle Z = \frac{\frac{x_1}{n_1}-\frac{x_2}{n_2}}{\sqrt{p\,(1-p)(1/n_1+1/n_2)}}$ with $\displaystyle p = \frac{x_1\,+\,x_2}{n_1\,+\,n_2}$, where $x_1$ and $x_2$ are the number of "successes" (in our case, sadly, heartburn) and $n_1$ and $n_2$ the number of subjects in each level of the categorical variable (Drug A and Drug B).
In the linked page there is an ad hoc formula. I have been toying with a spin-off that still has a lot of loose ends. It defaults to a two-tailed test with an alpha value of 0.05, but both can be changed (set t = 1 for a one-tailed test):
zprop = function(x1, x2, n1, n2, alpha = 0.05, t = 2){
  # x1, x2: successes; n1, n2: sample sizes; t = 2 two-tailed, t = 1 one-tailed
  nume = (x1/n1) - (x2/n2)                  # difference in sample proportions
  p = (x1 + x2) / (n1 + n2)                 # pooled proportion under H0
  deno = sqrt(p * (1 - p) * (1/n1 + 1/n2))  # standard error of the difference
  z = nume / deno
  print(c("Z value:", abs(round(z, 4))))
  print(c("Cut-off quantile:", abs(round(qnorm(alpha/t), 2))))
  print(c("pvalue:", round(t * pnorm(-abs(z)), 4)))  # multiplied by t for a two-tailed p
}
In our case:
zprop(64, 92 , 178, 190)
[1] Z value: 2.4183
[1] Cut-off quantile: 1.96
[1] pvalue: 0.0156
This is the same z value (up to sign) as the one returned by the z.prop() function on R-Bloggers, provided the arguments are passed in the order that function expects, z.prop(64, 92, 178, 190); note that calling it as z.prop(64, 178, 92, 190) returns [1] -5.44273, which mixes up successes and sample sizes and is not the statistic we are after.
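And to close the circle, the squared z statistic is exactly the uncorrected chi-squared statistic obtained above (a last numerical check, using the Antacid matrix):
p <- 156/368                                                 # pooled proportion of heartburn
z <- (64/178 - 92/190) / sqrt(p * (1 - p) * (1/178 + 1/190))
z^2                                                          # 5.8481
chisq.test(Antacid, correct = FALSE)$statistic               # X-squared = 5.8481, i.e. z^2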
OK... Hope it helps somebody out there, and I'm sure mistakes will be pointed out...