Tolerance bound for a normalized variable

Question

Based on a sample $(x_i) \sim_{\text{iid}} {\cal N}(\mu, \sigma^2)$, how can you get an exact or a well-approximated upper tolerance bound (i.e. an upper confidence bound of a quantile of the distribution) of $\dfrac{x_i}{\sigma}$? Here, we assume both parameters $\mu$ and $\sigma^2$ are unknown.

You can calculate likelihood-based confidence intervals for this quantity by using that $x_i/\sigma \sim N(\mu/\sigma,1)$. The $p$th quantile of $x_i/\sigma$ is then $q=\mu/\sigma+\Phi^{-1}(p)$, so you can reparameterise the likelihood of $(\mu,\sigma)$ in terms of $(q,\sigma)$ and obtain the profile likelihood of $q$ and the corresponding approximate confidence interval by the usual methods. There must be an exact approach but I do not know it. — , Jan 03 '13 at 11:34
Thanks @Procrastinator. As you can see I answer my own question. Starting a bounty when you have a cold is not a good idea. — Stéphane Laurent, Jan 03 '13 at 11:41

score 7 · Answer 1 · edited Jan 10 '13 at 14:18

Damn! After starting the bounty I realized the answer is easy: The quantile has form $\frac{\mu}{\sigma}+z_p$, and there are some known methods to get a confidence interval about the "effect size" $\frac{\mu}{\sigma}$ (or by inverting the bounds of a confidence interval about the coefficient of variation; such a confidence interval is available in the R package MBESS).
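A minimal sketch of one such method, assuming the standard noncentral $t$ pivot is the one meant (MBESS may implement it differently): $\sqrt{n}\,\bar x/s$ follows a noncentral $t$ distribution with $n-1$ degrees of freedom and noncentrality $\sqrt{n}\,\mu/\sigma$, so an exact upper confidence bound for $\mu/\sigma$ comes from inverting that distribution in its noncentrality parameter. The function name, the bracketing loop, and the sample data are illustrative only; the code uses Python with SciPy.

```python
from math import sqrt
from statistics import mean, stdev

from scipy.optimize import brentq
from scipy.stats import nct, norm


def upper_bound_normalized_quantile(x, p, alpha):
    """Upper (1 - alpha) confidence bound for mu/sigma + z_p,
    the p-quantile of x_i / sigma (hypothetical helper)."""
    n = len(x)
    # sqrt(n) * xbar / s ~ noncentral t(n - 1, sqrt(n) * mu / sigma)
    t_obs = sqrt(n) * mean(x) / stdev(x)

    # The noncentral t CDF at t_obs decreases as the noncentrality delta
    # grows, so the upper bound for delta solves P(T_delta <= t_obs) = alpha.
    def f(delta):
        return nct.cdf(t_obs, n - 1, delta) - alpha

    # Widen a bracket around t_obs until it contains the root.
    width = 1.0
    while f(t_obs - width) < 0 or f(t_obs + width) > 0:
        width *= 2.0
    delta_upper = brentq(f, t_obs - width, t_obs + width)

    # Convert the bound on sqrt(n) * mu / sigma into a bound on the quantile.
    return delta_upper / sqrt(n) + norm.ppf(p)


# Made-up data, for illustration only.
x = [1.2, 0.4, 2.1, 1.7, 0.9, 1.5, 2.4, 0.8, 1.1, 1.9]
print(upper_bound_normalized_quantile(x, p=0.95, alpha=0.05))
```

The result is always larger than the plug-in estimate $\bar x/s + z_p$, since it accounts for the sampling uncertainty in both $\bar x$ and $s$.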

edited Jan 10 '13 at 14:18 by cardinal

answered Jan 03 '13 at 11:36 by Stéphane Laurent

score 5 · Answer 2 · edited Jun 11 '20 at 14:32

In Environmental Statistics with S-Plus (CRC Press, 2001), Steven Millard and Nagaraj Neerchal state that a

$\beta$-content tolerance interval with confidence level $(1-\alpha)100$% is constructed so that it contains at least $\beta 100$% of the population (i.e., the coverage is at least $\beta 100$%) with probability $(1-\alpha)100$%.

Citing several sources (Wald & Wolfowitz 1946, Guttman 1970, Gibbons 1994), they give an approximate two-sided tolerance interval for $n$ observations in the usual form of (sample mean) $\pm$ $K$ (sample standard deviation) where

$$K = r \sqrt{\frac{n-1}{\chi^2_{n-1, \alpha}}}.$$

Here, $\chi^2_{n-1,\alpha}$ is the $\alpha$ quantile of a chi-square distribution with $n-1$ degrees of freedom and $r$ solves the equation

$$\Phi\left(\frac{1}{\sqrt{n}} + r\right) - \Phi\left(\frac{1}{\sqrt{n}} - r\right) = \beta$$

(with $\Phi$ the standard Normal CDF).

Wald and Wolfowitz (1946) show that this approximation is quite good, even for values of $n$ as small as $2$, provided both $\beta$ and $1-\alpha$ are greater than $0.95$. Furthermore, Ellison (1964) shows that for this approximation, the error in the confidence level is on the order of $1/n$.
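As a rough sketch of how the two-sided factor $K$ could be computed from these formulas (the function name `two_sided_k` and the root-finding bracket are assumptions made for illustration; Python with SciPy is used here):

```python
from math import sqrt

from scipy.optimize import brentq
from scipy.stats import chi2, norm


def two_sided_k(n, beta, alpha):
    """Approximate two-sided tolerance factor K for coverage beta
    and confidence level 1 - alpha (hypothetical helper)."""
    shift = 1.0 / sqrt(n)

    # r solves Phi(1/sqrt(n) + r) - Phi(1/sqrt(n) - r) = beta.
    def coverage_gap(r):
        return norm.cdf(shift + r) - norm.cdf(shift - r) - beta

    # The gap equals -beta at r = 0 and is positive for large r.
    r = brentq(coverage_gap, 0.0, 20.0)

    # chi2.ppf(alpha, n - 1) is the alpha quantile of chi-square(n - 1).
    return r * sqrt((n - 1) / chi2.ppf(alpha, n - 1))


# Factor for n = 10 observations, 95% coverage, 95% confidence.
print(two_sided_k(10, beta=0.95, alpha=0.05))
```

The interval is then (sample mean) $\pm$ `two_sided_k(n, beta, alpha)` times (sample standard deviation).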

Millard & Neerchal also discuss the simpler case of a one-sided tolerance interval, where $K$ is given by

$$K = \frac{t_{n-1, z_\beta \sqrt{n}, 1-\alpha}}{\sqrt{n}}.$$

The notation $t_{\nu, \delta, p}$ refers to the $p^\text{th}$ quantile of the non-central Student t distribution with $\nu$ degrees of freedom and noncentrality parameter $\delta$, and $z_p$ refers to the $p^\text{th}$ quantile of the standard Normal distribution, $z_p = \Phi^{-1}(p)$. This directly answers the present question.
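A short sketch of the one-sided factor, assuming SciPy's `scipy.stats.nct` as the Python counterpart of the noncentral `qt` in R (the function name `one_sided_k` is made up for this example):

```python
from math import sqrt

from scipy.stats import nct, norm


def one_sided_k(n, beta, alpha):
    """One-sided tolerance factor K for coverage beta and
    confidence level 1 - alpha (hypothetical helper)."""
    delta = norm.ppf(beta) * sqrt(n)  # noncentrality z_beta * sqrt(n)
    # (1 - alpha) quantile of the noncentral t with n - 1 degrees of freedom
    return nct.ppf(1 - alpha, n - 1, delta) / sqrt(n)


# Factor for n = 10 observations, 95% coverage, 95% confidence.
print(one_sided_k(10, beta=0.95, alpha=0.05))
```

The upper tolerance bound for the $x_i$ themselves is then (sample mean) + `one_sided_k(n, beta, alpha)` times (sample standard deviation).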

Software

I provide an Excel (VBA) macro for these (Normal-theory) intervals in a spreadsheet at http://www.quantdec.com/envstats/software/intervals.xls. Although I have not tried it, the R package tolerance appears to offer a broad suite of tolerance intervals (these formulas need to be modified to apply to regression residuals, for instance).

The non-central Student t distribution is available via the ncp argument through the qt function (and its relatives) found in the base R installation.

References

Wald, A., and J. Wolfowitz. (1946). Tolerance Limits for a Normal Distribution. Annals of Mathematical Statistics 17, 208-215.

Guttman, I. (1970). Statistical Tolerance Regions: Classical and Bayesian. Hafner Publishing Co., Darien, CT.

(I omit the Gibbons reference because it has typographical errors and adds nothing fundamentally different.)

Thanks for providing this information. But I don't see how you apply the classical theory for a Gaussian sample to the case of my question. Do you claim you divide the classical tolerance bound for $(x_i)$ by $\hat\sigma$? — Stéphane Laurent, Jan 09 '13 at 18:36
I missed that point. Your answer is indeed correct: you only need to add a constant to a confidence limit for $\mu/\sigma$, the reciprocal of the coefficient of variation. (+1 to your answer for that.) However, I have long wanted to have an exposition of the basic theory of tolerance intervals somewhere on this site, so if you don't mind, I would like to keep my answer open: in that way this thread will provide a reference for several kinds of tolerance intervals. — whuber, Jan 09 '13 at 19:36
