7

I have taught Six Sigma Black Belt classes using consultant-sourced training materials that included the following rule of thumb: when estimating the rate of occurrence of discrete events, such as the proportion of defective units produced by a manufacturing process, the sample should be large enough to include at least 5 defective units.

For example, if the defect rate is 1 in a million, you must sample at least 5,000,000 units to obtain a reasonably accurate estimate of the defect rate.

I need an authoritative reference supporting this rule of thumb.

gung - Reinstate Monica
  • It is hard to imagine how there could be such a reference--at least one with any credibility--because "reasonably accurate" is vague and *ought* to vary with circumstances, such as the magnitude of the consequences of out-of-control manufacturing runs. Wouldn't you be better off knowing how the expected number of defective units is related, *quantitatively*, to the accuracy of estimates based on samples? – whuber Oct 23 '14 at 17:25
  • I am trying to convince the organization that basing defect rate estimates on samples which only include 1 or 2 defects is unwise. – hugh graham Oct 24 '14 at 15:37
  • You might consider providing the associated (Poisson) confidence intervals for such estimates. Those estimates might or might not be "unwise," depending on the purpose, but by seeing some quantitative expression of uncertainty the organization will have the means to decide whether the sample sizes are adequate or not. – whuber Oct 24 '14 at 15:41
  • Could you please show me how to estimate the required sample size using the Poisson distribution, if that would be preferred? – hugh graham Oct 24 '14 at 20:54
  • Unfortunately [our thread on Poisson CIs](http://stats.stackexchange.com/questions/15371) is inadequate. A link I gave in a comment asserts that a $1-\alpha$ CI around the Poisson count $x$ is $(\chi^2_{2x}(\alpha/2)/2,\,\chi^2_{2(x+1)}(1-\alpha/2)/2)$. Use that to study the likely CIs any particular sample size would produce and choose the sample size accordingly. For instance, a 90% CI around $x=5$ is $(1.97, 10.51)$. Doubling the sample size would give a CI around $x=2\times 5=10$ of $(5.43,16.96)$, which is relatively more precise--and the likely $x$'s would be in a narrower range, too. – whuber Oct 24 '14 at 21:15
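
To illustrate the interval formula quoted in the comment above, here is a small R sketch (the helper name `poisson_ci` is my own, not from the thread):

# Exact chi-squared-based CI for a Poisson count x, as quoted above
poisson_ci <- function(x, alpha = 0.10) {
  c(lower = qchisq(alpha / 2, df = 2 * x) / 2,
    upper = qchisq(1 - alpha / 2, df = 2 * (x + 1)) / 2)
}
poisson_ci(5)   # lower ~1.97, upper ~10.51
poisson_ci(10)  # lower ~5.43, upper ~16.96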

2 Answers

4

This reference on Lean Six Sigma (p. 159) provides a formula to calculate the minimum sample size and mentions the "$5$ defectives" rule of thumb; the formula relates to the normal distribution, and it is perhaps more useful to the OP than my contested reasoning below (see the comments). I cannot vouch for the reference's level of authority, though.

This US government web source also mentions the rule of thumb, in the section "Transforming Poisson Data", relating it to the normal approximation to the Poisson distribution.
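
To get a feel for how good that normal approximation is at a mean of $5$, one can compare the Poisson($5$) probabilities with the matching normal density (a quick sketch; the grid `0:15` is an arbitrary choice):

# Poisson(5) pmf vs. the Normal(mean = 5, sd = sqrt(5)) density
x <- 0:15
round(cbind(x,
            poisson = dpois(x, lambda = 5),
            normal  = dnorm(x, mean = 5, sd = sqrt(5))), 4)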

But I would like to offer a specific argument that is consistent with this rule of thumb (not necessarily an optimal argument; see the comments):

One should first clarify what a "reasonably accurate" estimate means. Taking the confidence-interval route, we want a point estimate whose variance/standard deviation is small enough that the associated confidence interval does not include the value zero (and hence no negative values either, which would be nonsensical in our case and would also render the point estimate "statistically insignificant").

Associate each unit produced, $i$, with a Bernoulli random variable $X_i$ that takes the value $1$ if the unit is defective and $0$ if it is not. Assuming that all units have the same probability of being defective and that the $X_i$ are independent of one another, we can estimate this probability of a defect as

$$\hat p =\frac 1n\sum_{i=1}^nX_i $$

or, writing $n_1$ to denote the number of defective units,

$$\hat p = \frac {n_1}{n},\;\; \widehat{\operatorname{Var}}(\hat p) = \frac {\hat p(1-\hat p)}{n} = \frac {n_1(n-n_1)}{n^3}$$

Using the normal approximation to the binomial, a $90\%$ confidence interval is then

$$\frac {n_1}{n} \pm (z_{0.05}+0.5){\sqrt {\frac {n_1(n-n_1)}{n^3}}}= \frac {n_1}{n}\pm 2.15\frac {\sqrt {(n_1/n)(n-n_1)}}{n}$$

where we have added the $0.5$ "continuity correction" to the critical value from the standard normal distribution.

We want the CI not to include negative values, so we require

$$\frac {n_1}{n} -2.15\frac {\sqrt {(n_1/n)(n-n_1)}}{n} >0 \Rightarrow n_1^2 > (2.15)^2\cdot (n_1/n)(n-n_1)$$

Rearranging, we need

$$ n_1n > (2.15)^2\cdot (n-n_1) \Rightarrow n_1 > \frac {(2.15)^2n}{n+(2.15)^2}$$

For large $n$, as will be the case here, the right-hand side tends to $(2.15)^2 \approx 4.62$. Since $n_1$ is an integer, $n_1 > 4.62 \Rightarrow n_1 \geq 5$.
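
A quick numerical check of this threshold (a sketch; $n = 5\times 10^6$ is chosen to match the OP's one-in-a-million example):

# Lower bound of the approximate 90% CI derived above
ci_lower <- function(n1, n) n1 / n - 2.15 * sqrt(n1 * (n - n1) / n^3)
ci_lower(4, 5e6)  # negative: with 4 defects the interval still includes zero
ci_lower(5, 5e6)  # positive: with 5 defects the interval excludes zero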

Alecos Papadopoulos
  • The confidence intervals you propose are inappropriate for rare events and small samples. A better confidence interval will automatically exceed zero. A limit for "large $n$" is not applicable here, because $n_1$ must be kept constant, whence $p$ is diminishing. This is the setting for a *Poisson* distribution analysis, not a Normal distribution analysis. – whuber Oct 23 '14 at 18:30
  • @whuber I have no doubt about that (although only the "rare event" caveat seems fitting here; the OP appears to deal in any case with large samples). But irrespective of the disclaimer at the end of my answer, this was more of "detective work", trying to figure out what kind of (optimal or not, sophisticated or not) reasoning could have driven the stated rule of thumb. – Alecos Papadopoulos Oct 23 '14 at 18:42
  • But if you are applying procedures inappropriate for the setting, why should anyone be convinced of what they imply? That hardly justifies calling the result "easily derivable" or "intuitive"! You are not rescued by large samples (and the implicit appeal to normality via the CLT), precisely because the events are so rare. – whuber Oct 23 '14 at 18:44
  • @whuber Are they "inappropriate" or just "sub-optimal"? – Alecos Papadopoulos Oct 23 '14 at 18:48
  • I think there's no question of appropriateness. The suboptimality issue is a matter of quantifying how erroneous your results might become as a result of using Normal-theory calculations for Poisson distributions. Take a look at Poisson distributions with intensities less than $5$ and decide whether they look sufficiently Normal to trust your analysis. They very well might be--but taking this step is necessary to support your answer. – whuber Oct 23 '14 at 19:48
  • Could you please show me how to estimate the required sample size using the Poisson distribution, if that would be preferred? – hugh graham Oct 24 '14 at 17:16
  • @hughgraham, I would calculate your required N using the Binomial distribution, since you would know how many total units you have & for each unit whether it is defective or not. There exists free software (such as G*Power) for helping you with this task. You can find some information [here](http://stats.stackexchange.com/q/63391/7290). – gung - Reinstate Monica Oct 25 '14 at 23:46
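
One way to flesh out the suggestion in the comment above (a sketch of my own, not G*Power's method): search for the smallest $N$ such that at least $5$ defects are observed with high probability. The $90\%$ target and the $10\%$ step size are arbitrary choices:

# Approximate smallest n with P(at least 5 defects) >= target
min_n <- function(p, target = 0.90, min_defects = 5) {
  n <- ceiling(min_defects / p)            # start at the rule-of-thumb size
  while (pbinom(min_defects - 1, n, p) > 1 - target) n <- ceiling(n * 1.1)
  n
}
min_n(1e-6)  # noticeably larger than the rule-of-thumb 5e+06
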
3

I think this comes from the rule of thumb for using the normal approximation for a confidence interval (cf. @AlecosPapadopoulos' answer). In short, it is recommended that [the smaller of] $np$ [or $n(1-p)$] be greater than $5$ for the normal approximation to be used. When this condition holds, the normal approximation is generally considered acceptable, and tests that implicitly rely on it can be used in place of exact tests*.
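
As a rough check of that rule (a sketch; `n` and `p` below are arbitrary values chosen so that $np = 5$), one can compare the exact binomial CDF with its continuity-corrected normal approximation:

# Exact Binomial CDF vs. its normal approximation when n*p = 5
n <- 1000; p <- 0.005
x <- 0:10
round(cbind(x,
            exact  = pbinom(x, size = n, prob = p),
            approx = pnorm(x + 0.5, mean = n * p, sd = sqrt(n * p * (1 - p)))), 4)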


Following this rule does not guarantee that you will see a defect, but it seems to work pretty well. We can determine the probability of seeing at least one defect by using the binomial distribution's cumulative distribution function (CDF). Here I use R to do so:

# P(at least 1 defect) = 1 - P(0 defects), holding the expected count n*p at 5
1-pbinom(0, size=5e+01, prob=1e-01)  # [1] 0.9948462
1-pbinom(0, size=5e+02, prob=1e-02)  # [1] 0.9934295
1-pbinom(0, size=5e+03, prob=1e-03)  # [1] 0.9932789
1-pbinom(0, size=5e+04, prob=1e-04)  # [1] 0.9932637
1-pbinom(0, size=5e+05, prob=1e-05)  # [1] 0.9932622
1-pbinom(0, size=5e+06, prob=1e-06)  # [1] 0.9932621
1-pbinom(0, size=5e+07, prob=1e-07)  # [1] 0.9932621
1-pbinom(0, size=5e+08, prob=1e-08)  # [1] 0.9932621
1-pbinom(0, size=5e+09, prob=1e-09)  # [1] 0.9932621
1-pbinom(0, size=5e+10, prob=1e-10)  # [1] 0.9932621

Since we want the probability of getting anything but $0$ defects, we calculate the probability of getting exactly $0$ and subtract that from $1$. The computed probabilities are listed to the right as comments. They converge to $\approx 99.33\%$ as the defect probability decreases (and $N$ necessarily goes up).
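
This limiting value is $1 - e^{-5}$, the Poisson probability of seeing at least one event when the mean is $5$ (as @whuber notes in a comment below):

1 - exp(-5)  # [1] 0.9932621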


Some meta commentary: A standard paradigm contrasts accuracy vs. precision. Because common statistical procedures privilege unbiasedness (cf. maximum likelihood vs. shrinkage estimators), the process of determining the appropriate sample size (power analysis) tends to focus on getting an acceptable level of precision. In your case, this seems to be whether an interval estimate (e.g., a 95% CI) will have an accurate level of coverage.

* Technically, since the rule of thumb is $>5$ and not $\ge 5$, this would be based on an analogy to the lower limit of the rule of thumb; but then, rules of thumb by their nature shouldn't be taken as so exacting.

gung - Reinstate Monica
  • The question concerns estimating the rate of defects rather than making sure that a defect is observed. Why not, then, take it at face value and examine the variance of an estimator of the rate? When the expected number is $5$, the variance of any reasonable estimator will be approximately $5$, making two standard deviations (a rough "error bar") equal to almost $100\%$ of the estimate. If that's OK for one's application, then this rule of thumb may be OK. But if one would like to obtain a more accurate estimate, then clearly far more items have to be sampled. BTW, $99.32621\% \approx 1-e^{-5}$. – whuber Oct 23 '14 at 20:52
  • @whuber, I'm not sure I follow you. Using `qbinom`, I find that an approximate 95% CI for 5 defects, w/ N=5e+06 & p=1e-06 is [1, 10]. I recognize that the point wasn't to see if a defect showed up; I just threw that out b/c I thought it would be interesting. I am inferring that the reasoning behind this rule was to ensure the accuracy of common approximation techniques--like both you & Alecos, I have no idea what "reasonably accurate" means. – gung - Reinstate Monica Oct 23 '14 at 21:11
  • BTW, I'm not advocating 6sigma's recommendation. This is just a guess regarding where their rationale may have come from. – gung - Reinstate Monica Oct 23 '14 at 21:16
  • I'm unsure how you have obtained a CI without any data. I am guessing you are reporting a central 95% interval for a Poisson$(5)$ distribution. What it shows is that in practice one will likely observe between $1$ and $10$ defects when $5$ is the true expectation. The CIs associated with such observations will vary (a lot): it is worthwhile to contemplate what they might look like. First, though, you would have to propose a CI procedure. (I wouldn't use a Normal theory one, but even that could provide some insight.) – whuber Oct 24 '14 at 00:11