
A test has probability $P_f$ of failure where it is expected that $.0001 <P_f < .01$.

Given $N$ test samples where all tests succeed and there are no failures (thus the sample standard deviation is $\sigma=0$), what is the best way to estimate the approximate value of $P_f$, assuming that it is nonzero? What would be the most appropriate method to calculate error bars for this test?

  • Your description is not clear. What meaning have error bars in this situation? $\sigma$ does not make sense. A look at binomial tests (in Wiki) may be helpful. –  Apr 11 '19 at 21:42
  • Might be relevant: https://en.wikipedia.org/wiki/Binomial_proportion_confidence_interval – Artem Mavrin Apr 12 '19 at 00:29
  • There is no correct answer. My simplified rule of thumb would be to use $\frac1{2N}$ with an interval of $\left[0,\frac{2.5}{N}\right]$ – Henry Apr 12 '19 at 12:53

2 Answers

0

From the help in the comments, it looks like the rule of three is applicable here: with $N$ trials and no failures, the upper limit of the 95% confidence interval for the probability of failure is $\approx \frac3N$.
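As a rough illustration of where the rule of three comes from, the sketch below compares $\frac3N$ with the exact bound obtained by solving $(1-p)^N=0.05$ for $p$. The function names are made up for this example, and the sample sizes are arbitrary.

```python
def rule_of_three_bound(n: int) -> float:
    """Approximate 95% upper bound on the failure probability after n failure-free trials."""
    return 3.0 / n


def exact_upper_bound(n: int, confidence: float = 0.95) -> float:
    """Solve (1 - p)**n = 1 - confidence for p.

    This is the largest p for which n straight successes would still
    occur with probability at least 1 - confidence.
    """
    return 1.0 - (1.0 - confidence) ** (1.0 / n)


if __name__ == "__main__":
    # Arbitrary sample sizes, chosen only to show how close the two bounds are.
    for n in (300, 1000, 30000):
        print(n, rule_of_three_bound(n), exact_upper_bound(n))
```

The approximation works because $-\ln(0.05)\approx 2.996$, so the exact bound is always slightly below $\frac3N$.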

0

BEGINNING OF EDIT

When I wrote this, I misread your post. I treated successes, rather than failures, as the rare event. The answer is the same, however.

END OF EDIT

There is a good first-principles solution to this problem. You know that $$.0001<\theta<.01.$$ It appears you have no further information about $\theta$, so it can be treated as being drawn from a uniform distribution over the interval. Since probability density functions must integrate to one, $$\Pr(\theta)=\frac{10000}{99},\quad\theta\in(.0001,.01).$$

This is a binary process, so the likelihood, given no other information, is binomial. For illustration purposes, let us assume you have not observed a "success" in the rare event, but you have seen 500 failures. The likelihood of seeing this is $$\theta^0(1-\theta)^{500}.$$ The posterior density is $$\Pr(\theta\mid 500\text{ failures})=530.368(1-\theta)^{500},\quad\theta\in(.0001,.01).$$

Because the posterior density is decreasing in $\theta$, the highest density region is on the left side of this interval. The upper bound $a$ of the 95% highest density region solves $$\int_{.0001}^a 530.368(1-\theta)^{500}\,\mathrm{d}\theta=.95$$

The result is that $a\approx.00582$, so your interval is $(.0001,.00582)$.
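For anyone who wants to check the arithmetic, the integral has a closed form. This is only a sketch under the same uniform prior on $(.0001,.01)$; the symbol $C$ for the normalizing constant is introduced here for convenience. Since $$\int_{.0001}^{a}(1-\theta)^{500}\,\mathrm{d}\theta=\frac{.9999^{501}-(1-a)^{501}}{501},$$ the normalizing constant is $$C=\frac{501}{.9999^{501}-.99^{501}}\approx530.37,$$ and setting the posterior mass on $(.0001,a)$ equal to $.95$ gives $$(1-a)^{501}=.05\times.9999^{501}+.95\times.99^{501},\qquad a=1-\left(.05\times.9999^{501}+.95\times.99^{501}\right)^{1/501}.$$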

This is not a confidence interval; it is a Bayesian credible interval. Its interpretation is not that if you repeated this same experiment, the true value of the parameter would be in at least 95% of the resulting intervals. Instead, it is that, given the prior and the observed data, there is a 95% chance the true value of the parameter is in the interval.

Abstractly, for a highest density region with posterior probability $\alpha$ (for example, $\alpha=.95$), the upper bound $\gamma$ solves $$\Pr(.0001<\theta<\gamma\mid n=N,k=0)=\int_{.0001}^\gamma\frac{\frac{10000}{99}(1-\theta)^{n-k}}{\int_{.0001}^{.01}\frac{10000}{99}(1-t)^{n-k}\,\mathrm{d}t}\,\mathrm{d}\theta=\alpha,$$ where $n$ is the total number of observations and $k=0$ is the number of successes.
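The general condition can be solved in closed form for $k=0$, because $(1-\theta)^n$ integrates to $-\frac{(1-\theta)^{n+1}}{n+1}$. The sketch below is one possible way to code that up. The function names and default arguments are assumptions made for this illustration, and it only covers the uniform prior on $(.0001,.01)$ with no observed rare events.

```python
def posterior_cdf(gamma: float, n: int, lo: float = 1e-4, hi: float = 1e-2) -> float:
    """Posterior probability that theta lies in (lo, gamma), for k = 0 and a uniform prior on (lo, hi)."""
    m = n + 1
    top = (1.0 - lo) ** m
    bottom = (1.0 - hi) ** m
    # The prior constant 10000/99 cancels between numerator and denominator.
    return (top - (1.0 - gamma) ** m) / (top - bottom)


def credible_upper_bound(n: int, alpha: float = 0.95,
                         lo: float = 1e-4, hi: float = 1e-2) -> float:
    """Upper end gamma of the interval (lo, gamma) that holds posterior mass alpha."""
    m = n + 1
    top = (1.0 - lo) ** m
    bottom = (1.0 - hi) ** m
    # Invert posterior_cdf: (1 - gamma)**m = top - alpha * (top - bottom).
    target = top - alpha * (top - bottom)
    return 1.0 - target ** (1.0 / m)


if __name__ == "__main__":
    n = 500  # the illustrative sample size used in the answer
    gamma = credible_upper_bound(n)
    print(gamma, posterior_cdf(gamma, n))  # the second value should be close to 0.95
```

Because the posterior is monotonically decreasing when $k=0$, the highest density region always starts at the lower end of the prior, which is why only the upper bound needs solving.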

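Because of the prior, your variance can be estimated. Ignoring the truncation to $(.0001,.01)$, the example posterior is approximately a Beta distribution, and the variance formula for a Beta$(1,500)$ distribution gives $$\frac{500}{501^2\times502}=\frac{250}{63001251}\approx3.97\times10^{-6}.$$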
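As a cross-check, the mean and variance of the posterior restricted to $(.0001,.01)$ can also be integrated numerically. The sketch below uses a plain midpoint rule. The function name and the number of steps are arbitrary choices for this example, and because the density is cut off at $.01$, the result need not match an untruncated Beta formula exactly.

```python
def truncated_posterior_moments(n: int, lo: float = 1e-4, hi: float = 1e-2,
                                steps: int = 100_000) -> tuple[float, float]:
    """Mean and variance of the density proportional to (1 - theta)**n on (lo, hi).

    Uses the midpoint rule; the normalizing constant is computed with the
    same rule, so it cancels consistently in the ratios.
    """
    h = (hi - lo) / steps
    z = m1 = m2 = 0.0
    for i in range(steps):
        theta = lo + (i + 0.5) * h       # midpoint of the i-th subinterval
        w = (1.0 - theta) ** n           # unnormalized posterior density
        z += w
        m1 += theta * w
        m2 += theta * theta * w
    mean = m1 / z
    variance = m2 / z - mean * mean
    return mean, variance


if __name__ == "__main__":
    mean, variance = truncated_posterior_moments(500)
    print(mean, variance)
```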

Dave Harris