
I am trying to find the sample size required to establish a tolerance interval that contains 99% of the population with 95% confidence. The results from Minitab for a sample size of 473 are presented below:

[Image: Minitab tolerance interval output for n=473]

Since I don't know the distribution of my data, I am interested in the Nonparametric Method and Achieved Confidence columns. We have achieved a good level of confidence (95%) with n=473. The results below are achieved with n=1000:

[Image: Minitab tolerance interval output for n=1000]

The achieved confidence level is now 97.1%, which is higher than the previous one. This result makes sense to me, because I expect a better estimate from a larger sample. But when I repeat the procedure with an even larger sample size, the achieved confidence starts to decrease:

[Image: Minitab tolerance interval output for a larger sample size]

Here are my 2 questions:

  • What is the difference between the achieved confidence in the results, and the 95% confidence level that I select before running the calculation?
  • Why does the achieved confidence increase with the sample size up to a certain value and then start to decrease? (It seems that the achieved confidence converges toward the pre-selected confidence, which is 95%.)
  • You are drawing a false generalization: the "achieved confidence" bounces around (a lot) as the sample size is increased. The only guarantees are (1) it always exceeds 95% and (2) as the sample size increases without bound, the "achieved confidence" will approach 95% asymptotically. It would be nice to report the actual order statistics the software uses: does Minitab tell you that? – whuber Feb 21 '21 at 19:25
  • @whuber, thank you for the answer. I have tried to find the statistical formula used by Minitab but I couldn't find it. So how should I understand the difference between the two confidence levels (the theoretical one and the achieved one)? Also, when n is small, the achieved confidence is smaller than 95% (e.g. 91% when n=400). – holy_spirit Feb 21 '21 at 19:33
  • When $n$ is too small, it's impossible to achieve the nominal confidence level. A good resource is *Statistical Intervals* by Hahn & Meeker. I give an account of theory and derive the formula at https://stats.stackexchange.com/a/166839/919 (discovered by searching our site for [nonparametric tolerance interval](https://stats.stackexchange.com/search?q=nonparametric+tolerance+interval)). – whuber Feb 21 '21 at 23:00
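The behaviour discussed in the comments can be explored with a small sketch. It relies on the standard distribution-free result that the interval between two order statistics covers a Beta-distributed fraction of the population, so the achieved confidence is a binomial tail probability that depends only on how many of the n + 1 gaps between order statistics are left outside the interval. The selection rule (drop as many gaps as possible while staying at or above the nominal level) is an assumption about what the software does, not Minitab's documented algorithm, and the names `binom_cdf` and `achieved_confidence` are made up for illustration.

```python
import math


def binom_cdf(k, n, q):
    """P(Y <= k) for Y ~ Binomial(n, q); each term is computed in log space."""
    if k < 0:
        return 0.0
    if k >= n:
        return 1.0
    log_q, log_1mq = math.log(q), math.log1p(-q)
    total = 0.0
    for j in range(k + 1):
        log_pmf = (math.lgamma(n + 1) - math.lgamma(j + 1) - math.lgamma(n - j + 1)
                   + j * log_q + (n - j) * log_1mq)
        total += math.exp(log_pmf)
    return min(total, 1.0)


def achieved_confidence(n, p=0.99, nominal=0.95):
    """Return (m, confidence) for a two-sided distribution-free tolerance interval.

    m is the number of the n + 1 gaps between order statistics left outside
    the interval (m = 2 means the interval runs from the minimum to the maximum).
    The confidence that the interval covers at least a fraction p of the
    population is P(Y >= m) with Y ~ Binomial(n, 1 - p).

    Assumed selection rule (hypothetical, not taken from Minitab): use the
    largest m whose confidence is still >= nominal. If even m = 2 falls short,
    return m = 2 together with its sub-nominal confidence.
    """
    if n < 2:
        raise ValueError("need at least two observations")
    q = 1.0 - p
    m = 2
    conf = 1.0 - binom_cdf(m - 1, n, q)
    while m + 1 <= n:
        next_conf = 1.0 - binom_cdf(m, n, q)  # confidence if one more gap is dropped
        if next_conf < nominal:
            break
        m, conf = m + 1, next_conf
    return m, conf


if __name__ == "__main__":
    for n in (400, 473, 1000, 2000, 5000):
        m, conf = achieved_confidence(n)
        print(f"n={n:5d}  gaps excluded={m:3d}  achieved confidence={conf:.4f}")
```

Under these assumptions the sketch gives about 91% for n=400, 95.0% for n=473 and 97.1% for n=1000, in line with the figures quoted above. Because m can only change in whole steps, the achieved confidence jumps each time a further gap can be dropped rather than moving smoothly with n. How the excluded gaps are split between the lower and upper ends does not affect the confidence, only the total m does.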
