What distribution does Fisher's exact test assume?

Question

In my work I have seen several uses of Fisher's exact test, and I was wondering how well it fits my data. Looking at several sources I understood how to calculate the statistic, but never saw a clear and formal explanation of the assumed null hypothesis.

Can someone please explain, or refer me to, a formal explanation of the assumed distribution? I would be grateful for an explanation in terms of the values in the contingency table.

In the 2x2 case it's based on the hypergeometric distribution. — Glen_b, Dec 30 '14 at 09:56

Stéphane Laurent · Accepted Answer · 2015-08-07T12:20:41.983

In the $2\times 2$ case the distributional assumption is given by two independent binomial random variables $X_1 \sim \mathrm{Bin}(n_1, \theta_1)$ and $X_2 \sim \mathrm{Bin}(n_2, \theta_2)$. The null hypothesis is the equality $\theta_1=\theta_2$. But Fisher's exact test is a conditional test: it relies on the conditional distribution of $X_1$ given $X_1+X_2$. This distribution is a (noncentral) hypergeometric distribution with one unknown parameter, the odds ratio $\psi=\frac{\theta_1/(1-\theta_1)}{\theta_2/(1-\theta_2)}$, and the null hypothesis then becomes $\psi=1$.

This distribution has its own Wikipedia page, under the name "Fisher's noncentral hypergeometric distribution".
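
For the record, here is a short sketch of why conditioning gives this distribution, in terms of the table. The notation $x$ and $m$ is mine: $x$ is the number of successes in group 1 and $m$ the total number of successes, so the table has rows $(x,\ n_1-x)$ and $(m-x,\ n_2-m+x)$. I assume $0<\theta_1,\theta_2<1$. By independence,

$$P(X_1=x,\ X_2=m-x)=\binom{n_1}{x}\binom{n_2}{m-x}\,\theta_1^{x}(1-\theta_1)^{n_1-x}\,\theta_2^{m-x}(1-\theta_2)^{n_2-m+x}=\binom{n_1}{x}\binom{n_2}{m-x}\,\psi^{x}\,(1-\theta_1)^{n_1}\,\theta_2^{m}\,(1-\theta_2)^{n_2-m}.$$

The last three factors do not depend on $x$, so they cancel when dividing by $P(X_1+X_2=m)$, which is the sum of the same expression over all admissible $x$:

$$P(X_1=x \mid X_1+X_2=m)=\frac{\binom{n_1}{x}\binom{n_2}{m-x}\,\psi^{x}}{\sum_{u=\max(0,\,m-n_2)}^{\min(n_1,\,m)}\binom{n_1}{u}\binom{n_2}{m-u}\,\psi^{u}}.$$

So $\theta_1$ and $\theta_2$ enter only through $\psi$. When $\psi=1$ the denominator equals $\binom{n_1+n_2}{m}$ by Vandermonde's identity, and the distribution is the ordinary hypergeometric distribution determined by the margins $n_1$, $n_2$ and $m$ alone.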

To evaluate it with R, you can simply use the formula defining the conditional probability:

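# Values of theta_1 and theta_2 at which the distribution is evaluated
p1 <- 7/27
p2 <- 14/70
# Observed numbers of successes and group sizes
x1 <- 7; n1 <- 27
x2 <- 14; n2 <- 56
# Condition on the total number of successes
m <- x1+x2
# P(X1 = x1 | X1 + X2 = m): joint probability divided by the probability of the total
dbinom(x1, n1, p1)*dbinom(x2, n2, p2)/sum(dbinom(0:m, n1, p1)*dbinom(m-(0:m), n2, p2))
[1] 0.1818838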

Or use the dnoncenhypergeom function of the MCMCpack package:

psi <- p1/(1-p1)/(p2/(1-p2)) # this is the odds ratio
MCMCpack::dnoncenhypergeom(x=x1, n1, n2, x1+x2, psi)
[1] 0.1818838
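
To connect this with the test itself, here is a minimal sketch in base R. The helper name `dnchyper` is my own and does not come from any package. It writes the conditional probability with binomial coefficients and $\psi$ only, checks it against the binomial-ratio formula above, and then shows that at $\psi=1$ it is the ordinary hypergeometric distribution that `fisher.test` sums over to get its two-sided p-value.

```r
x1 <- 7; n1 <- 27
x2 <- 14; n2 <- 56
m <- x1 + x2
support <- max(0, m - n2):min(n1, m)   # possible values of X1 given X1 + X2 = m

# Hypothetical helper: P(X1 = x | X1 + X2 = m) as a function of the odds ratio psi
dnchyper <- function(x, psi) {
  w <- choose(n1, support) * choose(n2, m - support) * psi^support
  w[support == x] / sum(w)
}

# Same value as the binomial-ratio formula (theta_1 = 7/27, theta_2 = 14/70)
p1 <- 7/27; p2 <- 14/70
psi <- p1 / (1 - p1) / (p2 / (1 - p2))
ratio <- dbinom(x1, n1, p1) * dbinom(x2, n2, p2) /
  sum(dbinom(support, n1, p1) * dbinom(m - support, n2, p2))
all.equal(dnchyper(x1, psi), ratio)

# Under the null (psi = 1) this is the ordinary hypergeometric distribution
null_probs <- dhyper(support, n1, n2, m)
all.equal(dnchyper(x1, 1), dhyper(x1, n1, n2, m))

# Two-sided p-value: total null probability of the tables that are no more
# likely than the observed one (small tolerance for floating-point ties)
p_obs <- dhyper(x1, n1, n2, m)
p_manual <- sum(null_probs[null_probs <= p_obs * (1 + 1e-7)])
tab <- matrix(c(x1, n1 - x1, x2, n2 - x2), nrow = 2)
all.equal(p_manual, fisher.test(tab)$p.value)
```

Each `all.equal` call should return `TRUE`. Note that the null distribution depends only on the margins of the table, which is why no estimate of $\theta_1$ or $\theta_2$ is needed to compute the p-value.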

Thank you @Stephane. Can you explain further why it becomes hypergeometric, and what the parameters are? — Amit Lavon, Dec 30 '14 at 13:42
Sorry @AmitLavon, I don't know the details about this hypergeometric distribution. — Stéphane Laurent, Dec 30 '14 at 15:10
@AmitLavon I have just edited my answer to include the link to wikipedia and the R code. — Stéphane Laurent, Aug 07 '15 at 12:21

score 10 · Answer 2 · answered Dec 30 '14 at 12:13

Fisher's so-called "exact" test makes the same kind of subtle assumptions that $\chi^2$ tests make.

The two variables being assessed for association are truly polytomous all-or-nothing variables such as dead/alive or US/Europe. If one or both of the variables is a simplification of an underlying continuum, categorical data analysis should not be undertaken at all.
There are no other relevant background variables. If $Y$ is the outcome variable and $X$ is a variable being assessed for association with $Y$, the probability that $Y=y$ is identical for every subject with $X$ fixed at $x$. Contingency tables assume in effect that there is no heterogeneity in the distribution of $Y$ that is not accounted for by $X$. For example, in a randomized clinical trial studying the effect of treatment A vs. B on the probability of death, a $2\times 2$ contingency table test assumes that every subject on treatment A has the same probability of death. [One could argue that this is too stringent an assumption, but that position doesn't recognize the loss of power from doing unadjusted tests of association.]

Fisher's test makes one assumption not made by unconditional tests of association such as Pearson's $\chi^2$ test: that we are interested in the "current" marginal distribution of both $X$ and $Y$, that is, we are conditioning on the frequencies of the $Y$ outcome categories. This is not reasonable for prospective studies. The use of Fisher's test leads to conservatism. Its $P$-values are on average too large, because the test guarantees that the $P$-values are not too small. On average, Pearson $\chi^2$ $P$-values are more accurate than Fisher's, even with expected frequencies far lower than 5 in some of the cells.
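
The conservatism can be checked by exact enumeration rather than simulation. The following is only a sketch under assumptions I picked for illustration: the group sizes `n1` and `n2`, the common success probability `theta` and the level `alpha` are hypothetical, and tables for which Pearson's statistic is undefined (an empty row) are counted as non-rejections. It computes, with the null true, the probability that each test rejects at level `alpha` when the two groups are independent binomials.

```r
n1 <- 20; n2 <- 20   # hypothetical group sizes
theta <- 0.3         # hypothetical common success probability (the null is true)
alpha <- 0.05

# All possible outcomes (x1, x2) and their probabilities under the null
grid <- expand.grid(x1 = 0:n1, x2 = 0:n2)
weight <- dbinom(grid$x1, n1, theta) * dbinom(grid$x2, n2, theta)

# P-values of both tests for every possible table
pvals <- t(mapply(function(x1, x2) {
  tab <- matrix(c(x1, n1 - x1, x2, n2 - x2), nrow = 2)
  p_chisq <- suppressWarnings(chisq.test(tab, correct = FALSE)$p.value)
  c(fisher  = fisher.test(tab)$p.value,
    pearson = if (is.na(p_chisq)) 1 else p_chisq)  # undefined statistic: no rejection
}, grid$x1, grid$x2))

# Actual type I error: total probability of the tables that each test rejects
size <- colSums(weight * (pvals <= alpha))
size
```

Fisher's test cannot exceed `alpha` here, since it holds its level within every set of margins. How far below `alpha` it falls, and where the uncorrected Pearson test lands relative to `alpha`, depends on the chosen `n1`, `n2` and `theta`, so it is worth rerunning with values close to your own design.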

Thank you @FrankHarrell. Can you give references for your claim about chi-square P-values being more accurate than Fisher's? — Amit Lavon, Dec 30 '14 at 13:38
See for example http://www.citeulike.org/user/harrelfe/tag/fishers-exact-test. This has been discussed at length on stackexchange. — Frank Harrell, Dec 30 '14 at 14:21
Sadly, CiteULike is gone, and web.archive.org only seems to have crawled the first page of the harrelfe account. — Glen_b, Mar 17 '20 at 01:31
https://www.zotero.org/groups/2199991/feh/tags/fishers-exact-test/library — Frank Harrell, Mar 17 '20 at 19:48
