Pooled variance for post hoc tests for contingency table $\chi^{2}$ tests

Question

The 2-by-2 contingency table test provides an alternate hypothesis testing approach to the $z$ test for proportion difference. Conveniently enough, a 2-by-k contingency table test provides an omnibus test for proportion difference, akin to the one-way ANOVA's test for mean difference:

One-way ANOVA $H_{0}: \mu_{1} = \mu_{2} = \cdots = \mu_{k}$

2-by-$k$ contingency table test $H_{0}: p_{1} = p_{2} = \cdots = p_{k}$
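For concreteness, the omnibus statistic for such a table can be computed as in the minimal sketch below. It uses the standard Pearson form, the sum of (observed - expected)^2 / expected over all cells. The counts and the helper name `pearson_chi2` are hypothetical and serve only as illustration.

```python
def pearson_chi2(table):
    """Pearson chi-square statistic and degrees of freedom for an r-by-c table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under the null hypothesis of no association
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof


# Hypothetical 2-by-3 table: rows = event / no event, columns = k = 3 groups
table = [[30, 45, 60],
         [70, 55, 40]]
stat, dof = pearson_chi2(table)
print(stat, dof)
```

The statistic would be referred to a $\chi^{2}$ distribution with `dof` degrees of freedom, here $(2-1)(k-1) = 2$.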

If we reject the null hypothesis in the 2-by-$k$ contingency table test, we could proceed to conduct *post hoc* pairwise comparisons using the $z$ test for proportion difference between groups $i$ and $j$, where the test statistic is given by:

$$z = \frac{\hat{p}_{i}-\hat{p}_{j}}{\sqrt{\hat{p}\left(1-\hat{p}\right)\left[\frac{1}{n_{i}}+\frac{1}{n_{j}}\right]}}$$

And where (I think) $\hat{p}$ creates the pooled estimate assuming the contingency table's null hypothesis is true (i.e. $\hat{p}$ is the total number of events divided by the total sample size across all $k$ groups).
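To make the two candidate choices of $\hat{p}$ concrete, here is a small sketch (with hypothetical counts) that evaluates the formula above twice: once with $\hat{p}$ pooled over all $k$ groups, and once with $\hat{p}$ pooled over groups $i$ and $j$ only. It does not settle which choice is appropriate; it only shows that the two give different statistics. The function name `z_stat` and the variable names are illustrative assumptions.

```python
import math


def z_stat(x_i, n_i, x_j, n_j, p_hat):
    """z statistic for p_i - p_j, using a supplied pooled proportion p_hat."""
    se = math.sqrt(p_hat * (1 - p_hat) * (1 / n_i + 1 / n_j))
    return (x_i / n_i - x_j / n_j) / se


# Hypothetical data: event counts and sample sizes for k = 3 groups
events = [30, 45, 60]
sizes = [100, 100, 100]

# Pooled over all k groups (the omnibus null hypothesis)
p_all = sum(events) / sum(sizes)
# Pooled over the two compared groups only (the usual two-sample z test)
p_pair = (events[0] + events[1]) / (sizes[0] + sizes[1])

z_all = z_stat(events[0], sizes[0], events[1], sizes[1], p_all)
z_pair = z_stat(events[0], sizes[0], events[1], sizes[1], p_pair)
print(z_all, z_pair)
```

With the pair-only $\hat{p}$, $z^{2}$ coincides with the uncorrected Pearson $\chi^{2}$ statistic of the corresponding 2-by-2 table; with the all-groups $\hat{p}$ it does not, which is the discrepancy the questions below ask about.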

Question 1: How to incorporate this pooled estimate into post hoc pairwise 2-by-2 contingency table tests (i.e. $\boldsymbol{\chi^{2}}$ tests)? (bonus points if you can speak to continuity corrections)

Contingency table tests are also useful for posing questions about evidence of association between two categorical variables where *both* have more than 2 categories. For an $l$-by-$k$ contingency table test, where $l>2$ and $k>2$, if we reject the null hypothesis and wish to proceed to conduct *post hoc* subgroup tests:

Question 2a: How do we incorporate the pooled variance under the $\boldsymbol{l}$-by-$\boldsymbol{k}$ contingency table test's null hypothesis for post hoc 2-by-2 table tests (i.e. $\boldsymbol{\chi^{2}}$ tests)? (bonus for continuity corrections)

Question 2b: How do we incorporate the pooled variance under the $\boldsymbol{l}$-by-$\boldsymbol{k}$ contingency table test's null hypothesis for post hoc $\boldsymbol{m}$-by-$\boldsymbol{n}$ contingency table tests (i.e. $\boldsymbol{\chi^{2}}$ tests), either $\boldsymbol{2<m\le l}$ OR $\boldsymbol{2<n\le k}$, or $\boldsymbol{2<m\le l}$ AND $\boldsymbol{2<n\le k}$, and these tests ARE disjoint (i.e. they do not overlap on the $\boldsymbol{l}$-by-$\boldsymbol{k}$ contingency table).

Question 2c: How do we incorporate the pooled variance under the $\boldsymbol{l}$-by-$\boldsymbol{k}$ contingency table test's null hypothesis for post hoc $\boldsymbol{m}$-by-$\boldsymbol{n}$ contingency table tests (i.e. $\boldsymbol{\chi^{2}}$ tests), either $\boldsymbol{2<m\le l}$ OR $\boldsymbol{2<n\le k}$, or $\boldsymbol{2<m\le l}$ AND $\boldsymbol{2<n\le k}$, and these tests ARE NOT disjoint (i.e. they do overlap on the $\boldsymbol{l}$-by-$\boldsymbol{k}$ contingency table).

[An addition by @ttnphns. This very interesting question was re-inspired, as Alexis herself noted, by this recent one.]

To my (non-expert) gut this looks like something that's difficult to do properly in frequentist setting. Would sketching a Bayesian solution for this class of problems be an acceptable answer or are you confined to the frequentist approach? — Martin Modrák, Mar 09 '18 at 19:52
@MartinModrák Frequentist. Thank you for asking. If there's *no* feasible frequentist answer, an explanation of why would be an acceptable answer for me. — Alexis, Mar 09 '18 at 20:40
I don't think it is infeasible or impossible, just that in my experience with frequentist settings, small variations in the question you ask often require deriving novel formulas which might or might not be found in some half-forgotten paper from 30 years ago :-) (you know, just the usual Bayesian propaganda) — Martin Modrák, Mar 10 '18 at 12:21
I am not sure what the problem is with *"How do we incorporate the pooled variance..."*. Does it not boil down to performing multiple ordinary z-tests for pairwise comparisons? (where I think $\hat{p}$ refers to the pooled probability for the i-th and j-th cell, since that is the hypothesis tested with that formula: $p_i = p_j$) — Sextus Empiricus, Mar 13 '18 at 11:48
What would be the hypothesis in the *'post hoc m-by-n contingency table tests'* and why do you look for some form of z-test when a chi-squared test seems more suitable? — Sextus Empiricus, Mar 13 '18 at 12:01
@MartijnWeterings I am not looking for a *z* test. In my question I point out that the unpaired *z* test for proportion difference is homologous to the 2x2 $\chi^{2}$ test. In the former one uses pooled variance, in the latter similarly one uses marginal totals. Now my question asks what to use for pooled variance (1) in *post hoc* 2-by-2 tests from a 2-by-*k* omnibus test; (2a) in *post hoc* 2-by-2 tests from an *l*-by-*k* omnibus test; (2b and 2c) in *m*-by-*n* *post hoc* tests from an *l*-by-*k* omnibus test (disjoint and not disjoint, respectively). — Alexis, Mar 13 '18 at 17:09
@MartijnWeterings $H_{0}:$ there is no association between variable $x$ having $m$ categories, and variable $y$ having $l$ categories. $H_{A}:$ there **is** such an association. I know how to perform this $\chi^{2}$ test **by itself**. However, a prior omnibus test implies a more precise pooled variance estimate (if you are familiar with frequentist ANOVA, Kruskal-Wallis, I mean "pooled variance" for the *post hoc* tests in precisely the same way): my question is how to obtain the pooled variance. — Alexis, Mar 13 '18 at 17:16
Mathematically, the _posthoc_ z-test (for independent proportions) is the same as the _apriori_ z-test. (Which is different from the ANOVA/t-test situation, where mean and variance of a variate are not "one and the same" thing.) The difference is not in math in this case but only in how you apply the test. Whether you value the omnibus (chi-square) H0 or disregard it as irrelevant. And whether you test (compare) all groups or a subset of them. That's how I'm used to think; I can be wrong. — ttnphns, Oct 13 '20 at 05:12
`Question 1: How to incorporate this pooled estimate into post hoc pairwise 2-by-2...` May I ask you a Q too? When you are doing a post hoc t-test, are you using the pooled estimate of the $k$ groups' _mean_ somewhere in its formula? Proportion is also a mean. — ttnphns, Oct 13 '20 at 06:02
@ttnphns Yes, the mean appears in the pooled sample variance. But, as I wrote elsewhere, you are mistaken about the variance *post hoc* & *a priori* being the same: in an omnibus testing situation, proportion tests have a *more reliable estimate of the proportion under the null hypothesis when **data from all groups are used***. This more reliable estimate of the mean (proportion) is the basis for pooled variance, *not* the estimated proportion under the null from only two groups. *Post hoc* tests—my question's subject—are *not* independent of the omnibus tests they follow. — Alexis, Oct 13 '20 at 15:11
I don't understand what you mean by *the* "pooled variance," because the variance of $\hat p_i-\hat p_j$ varies with $i$ and $j.$ Intuitively, "pooling" the variances of all the proportion estimates ought to lead to an inadmissible procedure. Consider, as an extreme example, a table dominated by a huge count in one cell, implying its proportion will be precisely estimated but the other proportions will not be. What would a "pooled variance" be in such a situation and why would it even be relevant to comparing a difference between two of the lightly populated cells? — whuber, Oct 13 '20 at 16:12
Maybe what you're looking for could be satisfied with a GLM of the table instead of a chi-squared analysis. — whuber, Oct 13 '20 at 16:12
Alexis. Here is one spot where you **possibly could see an answer** or something. Do you have a package doing Mood's median test _with_ subsequent post hoc pairwise comparisons? The test is actually the 2xk table Pearson Chi-square test (maybe with Yates correction, I don't remember); the two row categories by default being "<= total median" vs "> it", but you may specify an arbitrary value as the cut-point, so you set the two categories to the two categories of your response variable. — ttnphns, Oct 13 '20 at 16:51
(cont.) Thus, you've got the usual omnibus Chi-square _plus_ the post-hoc following it (if it is significant). Inspect this and see if there will be any difference in results with the usual pairwise z-test. In SPSS (which I work with most of the time), the algorithms doc says: pairwise comparison "perform the median test using data only consisting of sample i and sample j as if other samples don’t exist". In other words, it just does usual z-test (aka 2x2 chi-sq), without any pooling of variance. — ttnphns, Oct 13 '20 at 17:07
@whuber "why would it even be relevant to comparing a difference between two of the lightly populated cells?" Because if the omnibus null hypothesis is true, then the most reliable estimate of $\hat{p}$ results from treating all $k$ groups as coming from a single sample, and therefore the same holds for the estimate of the variance and SE; that is what I am referring to as "pooling". As to different sample sizes, in the one-way ANOVA case, the *post hoc* test weights the pooled SE accordingly, so I would imagine something like that may apply also here? — Alexis, Oct 13 '20 at 17:44
@ttnphns Thank you for the recommendation about Mood's test! I will investigate and report back. :) — Alexis, Oct 13 '20 at 17:45
I'm lost at what your "$\hat p$" refers to in a $2\times k$ table. And regardless, the variances of various *estimates* of probabilities related to any such table are typically going to be very different, leading us back to my original question: just what would some "pooled variance" refer to that is relevant to any hypothesis you want to test? — whuber, Oct 13 '20 at 17:52
1/2 @whuber I am somewhat at a loss to communicate this more clearly: $\hat{p}$ is the sample proportion taking the data from all $k$ groups. Not just from group $1$ (which has sample size $n_1$), not from group $2$ (sample size $n_2$), but estimated using all observations from all groups (i.e. the grand mean/grand proportion), with sample size $N = n_1 + n_2 + \dots + n_k$. *If* the omnibus null hypothesis is true, then $\hat{p}$ is the most reliable estimator of the true population proportion in each group. — Alexis, Oct 13 '20 at 22:22
2/2 The pooled standard errors for *post hoc* pairwise tests would presumably be adjusted for differing sample sizes in a very similar fashion to the way *post hoc* unpaired *t* tests following a one-way ANOVA are: $t_{ij} =\frac{\overline{x}_i - \overline{x}_j}{\sqrt{s^{2}_{\text{pooled}}\left(\frac{1}{n_i} + \frac{1}{n_j}\right)}}$. The term $s^{2}_{\text{pooled}}$ equals the within-groups sample variance $s^{2}_{\text{w}}$. My question up top is about an analog in the contingency table test world. As an aside, other *post hoc* tests use similar pooled variance estimates (e.g., Dunn's test, etc.) — Alexis, Oct 13 '20 at 22:28
As from _my_ p.o.v., Alexis (pardon for coming back), I can only repeat what I've said: We have no right to use the pooled variance as you wish, in post hoc test of two proportions, because, _unlike_ in the post hoc t-test, the pooled variance here is the _function of_ the grand mean (the grand prop. $\hat{p}$), and so by virtue of using the pooled variance we are forcing $\hat{p}$ through into the comparison of $\hat{p_i}$ with $\hat{p_j}$: we implicitly drag an "alien" prop. into the comparison of these two props. While in the t-test we don't involve that alien term (grand mean). — ttnphns, Oct 14 '20 at 01:06
There is a clear logical disconnect between assuming "if we reject the null hypothesis" and then asking for a pooled variance, because the latter exists only when the null hypothesis holds. Perhaps one way to make progress with this discussion, Alexis, would be to articulate (mathematically) what you mean by "reject the null:" exactly what is your model in that situation? — whuber, Oct 14 '20 at 13:30
@whuber If we reject an omnibus null hypothesis $\text{H}_0\text{: }\mu_1 = \mu_2 = \dots = \mu_k$, then at least one mean differs from at least one other mean. The "pooled variance" I am struggling to understand in *post hoc* tests seems to be a function of all samples, but is weighted by respective sample sizes in the calculation of *post hoc* test statistics. The "only exists when the null hypothesis holds," yes, and yet the logic of *post hoc* tests (as expressed—incorrectly?—in many introductory applied textbooks) says as long as $\text{H}_0\text{: }\mu_i = \mu_j$, use pooled variance. — Alexis, Oct 14 '20 at 15:29
@ttnphns (I appreciate your coming back! I am clearly struggling to understand a thing or three here. I welcome your patience. :) I hear that because the grand mean (grand proportion) does not appear in the within-group variance, it should not be included in a calculation of some kind of pooled variance for a *post hoc* test for proportion difference. Ok. What about an average of sample variances for Bernoulli-distributed variables across $k$ groups? What about similar for categorically-distributed variables? I hear you saying there is no need for a specific *post hoc* test, yes? Why not? — Alexis, Oct 14 '20 at 15:39
You seem to be quoting textbooks that are analyzing *means* rather than counts (or, equivalently, proportions). The difference is that the counts, if their expectations differ, are automatically heteroscedastic. (That is factored into the chi-squared test, for instance). The heteroscedasticity calls into question the meaning of any pooling based on counts with hypothetically different expectations. — whuber, Oct 14 '20 at 16:06
@whuber I am indeed starting from a position of familiarity with *post hoc* tests for mean difference, or for mean rank difference, and looking at contingency table tests and wondering, and that is the motivation for my question. An acceptable answer would explain why the pooled variance is not possible or unnecessary for count data, and make the case that *post hoc* $\chi^{2}$ tests between 2 groups have an independence from the omnibus ($k$ group) $\chi^{2}$ test that is different than for analogous tests for mean difference in continuous data. (Also: Gosh! Thank you for *your* patience!) — Alexis, Oct 14 '20 at 16:55
Alexis, in post hoc t test we don't _have_ to use the variance estimated from _all_ k samples. Rather, this is the most logical decision (since in post hoc, the H0 sounds as "these two groups don't differ, as the rest, too"), but isn't the only permissible decision; and we might decide instead to use the variance pooled only from the groups i and j, or even from some other groups of k. We may do these poolings because (1) all groups are considered one population in the H0, _and_ because the variance can be seen as independent quantity from _the two means_ being currently compared. — ttnphns, Oct 14 '20 at 17:20
(cont.)... and so no contamination by other _means_ occurs when the means i and j are compared by the pairwise post hoc t test. — ttnphns, Oct 14 '20 at 17:36
@ttnphns That is *really* helpful.. and is starting to land with me. Please note the acceptability of an answer in the negative in my most recent response to whuber. — Alexis, Oct 14 '20 at 20:51

