p-value for two small Bernoulli trials

Question

I am no statistician, but I am among the more statistically inclined at my workplace, so my coworkers consult me from time to time. I don't always give the right answer, but I'd like to think I give a reasonable answer most of the time. This time, I couldn't find exactly what I was looking for on the internet, so I come here for help.

At our laboratory, they have received 23 samples and tested them for a pathogen; 16 came out positive. In the general population, they would expect the prevalence to be about 10%, so this is rather high. They want to be sure, though, and therefore want to design a control study. The main issue I was asked about was the control sample size. They didn't want a large control group, so I was asked to look into how strong the result would be for different control sizes and different control results.

So the first thing I found was the test statistic $$ z = \frac{\hat p_1 - \hat p_2}{\sqrt{\hat p(1-\hat p)\left(\frac1{n_1}+ \frac1{n_2}\right)}} $$ but it came with the stipulation that both trials have more than five successes and more than five failures. And if we expect a control prevalence of 10%, that means we should have at least 50 controls. That's more than the lab guys expect.
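For reference, here is a minimal Python sketch of the pooled $z$ statistic above, together with the more-than-five rule of thumb. The function names are made up for illustration, and the control result of 2 positives out of 20 is a hypothetical value, not lab data.

```python
from math import erfc, sqrt

def pooled_z(x1, n1, x2, n2):
    """Pooled two-proportion z statistic for x1/n1 versus x2/n2.

    Raises ZeroDivisionError if the pooled proportion is exactly 0 or 1.
    """
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se

def rule_of_thumb_ok(x, n, minimum=5):
    """True if the group has more than `minimum` successes and failures."""
    return x > minimum and (n - x) > minimum

# Cases: 16 of 23 positive. Controls: hypothetical 2 of 20 positive.
z = pooled_z(16, 23, 2, 20)
p_upper = 0.5 * erfc(z / sqrt(2))  # upper-tail area of the standard normal
print(f"z = {z:.3f}, one-sided normal p = {p_upper:.2e}")
print("cases satisfy rule of thumb:   ", rule_of_thumb_ok(16, 23))
print("controls satisfy rule of thumb:", rule_of_thumb_ok(2, 20))
```

With a control group this small the rule of thumb fails for the controls, so the normal approximation behind the printed $p$-value should not be trusted here.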

And I couldn't find any info on the internet to help me with the small sample size. So, this is what sounds reasonable to me:

Let the null hypothesis be that the control samples and the case samples come from the same Bernoulli distribution. Let the control sample size be $n$, and the control results be $k$ positive. Then we have $23+n$ samples from a Bernoulli trial, and the first $23$ of them had $16$ of the $16+k$ positive results.

The probability that at least $16$ of the positive results happen to appear among the first $23$ samples is $$ \sum_{j = 0}^{k}\frac{\displaystyle\binom{23}{16+j}\cdot \binom{n}{k-j}}{\displaystyle\binom{23+n}{16+k}} $$

To a non-statistician like me, that looks like a (one-tailed) $p$-value. And as the numbers are small, it's rather easy to compute.
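A minimal sketch of that computation in Python, using only the standard library. The function name `one_tailed_p` and the control sizes in the loop are my own choices for illustration; `math.comb` returns 0 when the lower index exceeds the upper one, so terms with $16+j > 23$ drop out on their own.

```python
from math import comb

def one_tailed_p(n, k, cases=23, case_pos=16):
    """Probability that at least `case_pos` of the case_pos + k positives
    land among the `cases` case samples, given n controls with k positives."""
    total_pos = case_pos + k
    denom = comb(cases + n, total_pos)
    numer = sum(comb(cases, case_pos + j) * comb(n, k - j)
                for j in range(k + 1))
    return numer / denom

# Tabulate the tail probability for a few hypothetical control sizes.
for n in (10, 20, 30):
    row = ", ".join(f"k={k}: {one_tailed_p(n, k):.2e}" for k in range(6))
    print(f"n={n:2d}  {row}")
```

The table makes it easy to see, for each control size, how many control positives could be tolerated before the tail probability rises above whatever threshold is chosen.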

I am pretty certain that I have missed something, because this is not my area of expertise. The fact that this is one-tailed is one of those things. Is this approach salvageable, or am I completely off the mark?

Seeing 16 positive tests out of 23 gives you an estimate of $P(\text{Pos Test})$ in the population of tested subjects. That is _not_ directly comparable to $P(\text{Has Disease})$ (disease prevalence) in that population. If you know the sensitivity and specificity of the test, you might be able to estimate the prevalence. — BruceET, Sep 12 '19 at 21:22
@BruceET You're right. I was careless with my words there. What if we, for the purposes of this question, and for simplicity, assume perfect sensitivity and specificity? — Arthur, Sep 13 '19 at 04:39
Possibly a massively unrealistic assumption. Seems like it might be more of a disconnect than being 'careless with words'. But if you're sure, then (according to the first displayed equation in my Answer), prevalence is the same as the proportion testing positive. If you are comparing two sample proportions with small numbers of trials, I suggest you look at Fisher's exact test. Your last equation looks as if you may be approaching that. // Suggest you think it through and ask a more precise question when you've done that. — BruceET, Sep 13 '19 at 05:40
@BruceET Well, Fisher's exact test seems to be exactly the type of thing I'm looking for. That's as one-tailed as mine, and while it doesn't give the exact same numbers, it's close. So it seems like I was onto something. And yeah, for actual publication or something like that, one will have to look into specificity and sensitivity, of course. In this case it was meant as preliminary calculations to figure out if one could reasonably get away with 20 samples, or if one needed 60. So while I don't want to be _wrong_ wrong, I'm OK with cutting some corners on something I'm just throwing together. — Arthur, Sep 13 '19 at 12:58

BruceET · Accepted Answer · 2019-09-12T22:21:51.560

Define $\pi =P(\text{Dis})$ as prevalence, and let $\tau = P(\text{Pos}).$

Also define $\eta = P(\text{Pos}|\text{Dis})$ as sensitivity and $\theta =P(\text{Neg}|\text{No Dis})$ as specificity.

Then $\tau = \pi\eta + (1-\pi)(1-\theta),$ because a subject without the disease tests positive with probability $1-\theta.$ With a little algebra, one obtains $$\pi = \frac{\tau+\theta - 1}{\eta + \theta - 1}.$$

So if $\tau$ is estimated by $t = \frac{\text{Nr. testing Pos}}{\text{Nr. tested}},$ then $\pi$ may be estimated by $$p = \frac{t+\theta - 1}{\eta + \theta - 1}.$$

One can obtain a binomial confidence interval (CI) for $\tau$ and use the second displayed equation to transform endpoints of the CI for $\tau$ to get a CI for $\pi$ (of the tested population).

In some cases, the estimate $p$ of prevalence can fall outside of $(0, 1).$ Then one can use a Bayesian approach and perhaps Gibbs sampling to get a useful interval estimate for $\pi.$
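As a rough illustration of the steps above, here is a Python sketch. The sensitivity of 0.95 and specificity of 0.90 are invented values, and the Wilson interval is just one convenient closed-form choice of binomial CI; neither comes from the question.

```python
from math import sqrt

def prevalence_estimate(t, sens, spec):
    """Map a proportion testing positive, t, to a prevalence estimate."""
    return (t + spec - 1) / (sens + spec - 1)

def wilson_interval(x, n, z=1.96):
    """Approximate 95% Wilson score interval for a binomial proportion."""
    phat = x / n
    denom = 1 + z * z / n
    centre = (phat + z * z / (2 * n)) / denom
    half = z * sqrt(phat * (1 - phat) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half

sens, spec = 0.95, 0.90          # hypothetical test characteristics
x, n = 16, 23                    # positives and total among the cases

t = x / n
lo, hi = wilson_interval(x, n)
print(f"t = {t:.3f}, CI for tau = ({lo:.3f}, {hi:.3f})")
print(f"p = {prevalence_estimate(t, sens, spec):.3f}")
# The map is increasing when sens + spec > 1, so endpoints map to endpoints.
print(f"CI for pi = ({prevalence_estimate(lo, sens, spec):.3f}, "
      f"{prevalence_estimate(hi, sens, spec):.3f})")

# A low positive rate can push the estimate below 0:
print(f"t = 0.05 gives p = {prevalence_estimate(0.05, sens, spec):.3f}")
```

The last line shows the out-of-range case: with these assumed values, a positive rate below $1-\theta = 0.10$ yields a negative estimate.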

This will definitely be important if I'm asked about the actual results and need to take into account the fact that the tests aren't perfect. Thank you. — Arthur, Sep 13 '19 at 13:10
