Based on a sample $(x_i) \sim_{\text{iid}} {\cal N}(\mu, \sigma^2)$, how can you get an exact or a well-approximated upper tolerance bound (i.e. an upper confidence bound of a quantile of the distribution) of $\dfrac{x_i}{\sigma}$ ? Here, we assume both parameters $\mu$ and $\sigma^2$ are unknown.
-
Presumably, both parameters are unknown. – whuber Nov 29 '12 at 23:49
-
@whuber Yes they are. – Stéphane Laurent Nov 30 '12 at 08:04
-
You can calculate likelihood-confidence intervals of this quantity by using that $x_i/\sigma \sim N(\mu/\sigma,1)$. Using this, you have that the $p$th quantile of $x_i/\sigma$ is $q=\mu/\sigma+\Phi^{-1}(p)$; then you can reparameterise the likelihood of $(\mu,\sigma)$ in terms of $(q,\sigma)$ and obtain the profile likelihood of $q$ and the corresponding approximate confidence interval by the usual methods. There must be an exact approach but I do not know it. – Jan 03 '13 at 11:34
-
Thanks @Procrastinator. As you can see I answer my own question. Starting a bounty when you have a cold is not a good idea. – Stéphane Laurent Jan 03 '13 at 11:41
2 Answers
Damn! After starting the bounty I realized the answer is easy: the quantile has the form $\frac{\mu}{\sigma}+z_p$, where $z_p = \Phi^{-1}(p)$, and there are known methods for getting a confidence interval for the "effect size" $\frac{\mu}{\sigma}$ (or one can invert the bounds of a confidence interval for the coefficient of variation; such a confidence interval is available in the R package MBESS).
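As a small illustration of that reduction, here is a sketch in Python (standard library only). The function names are hypothetical, and `effect_size_upper` stands for an upper confidence bound on $\mu/\sigma$ obtained by whatever method you prefer; the code only shows that the bound for the quantile is that bound shifted by $z_p$.

```python
from statistics import NormalDist


def scaled_quantile(mu, sigma, p):
    """p-th quantile of X / sigma when X ~ N(mu, sigma^2)."""
    # X / sigma ~ N(mu / sigma, 1), so its quantile is mu/sigma + z_p.
    return mu / sigma + NormalDist().inv_cdf(p)


def bound_for_quantile(effect_size_upper, p):
    """Turn an upper confidence bound for mu/sigma into one for the quantile.

    effect_size_upper is assumed to come from some confidence-interval
    method for mu/sigma; adding the constant z_p keeps the confidence level.
    """
    return effect_size_upper + NormalDist().inv_cdf(p)


# Illustrative (made-up) parameter values.
q = scaled_quantile(mu=3.0, sigma=1.5, p=0.9)
same_q = NormalDist(mu=2.0, sigma=1.0).inv_cdf(0.9)
print(q, same_q)
```

The two printed values agree because dividing by $\sigma$ only rescales the Normal distribution to unit variance.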

In Environmental Statistics with S-Plus (CRC Press, 2001), Steven Millard and Nagaraj Neerchal state that a
$\beta$-content tolerance interval with confidence level $(1-\alpha)100$% is constructed so that it contains at least $\beta 100$% of the population (i.e., the coverage is at least $\beta 100$%) with probability $(1-\alpha)100$%.
Citing several sources (Wald & Wolfowitz 1946, Guttman 1970, Gibbons 1994), they give an approximate two-sided tolerance interval for $n$ observations in the usual form of (sample mean) $\pm$ $K$ (sample standard deviation) where
$$K = r \sqrt{\frac{n-1}{\chi^2_{n-1, \alpha}}}.$$
Here, $\chi^2_{n-1,\alpha}$ is the $\alpha$ quantile of a chi-square distribution with $n-1$ degrees of freedom and $r$ solves the equation
$$\Phi\left(\frac{1}{\sqrt{n}} + r\right) - \Phi\left(\frac{1}{\sqrt{n}} - r\right) = \beta$$
(with $\Phi$ the standard Normal CDF).
Wald and Wolfowitz (1946) show that this approximation is quite good, even for values of $n$ as small as $2$, provided both $\beta$ and $1-\alpha$ are greater than $0.95$. Furthermore, Ellison (1964) shows that for this approximation, the error in the confidence level is on the order of $1/n$.
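The equation $\Phi(1/\sqrt{n} + r) - \Phi(1/\sqrt{n} - r) = \beta$ has no closed-form solution, but its left-hand side increases from $0$ to $1$ as $r$ grows, so bisection finds $r$ easily. A minimal sketch in Python, using only the standard library (the name `solve_r` and the tolerance are my own choices); computing $K$ itself additionally needs a chi-square quantile, which the standard library does not provide:

```python
import math
from statistics import NormalDist


def solve_r(n, beta, tol=1e-12):
    """Root r of Phi(1/sqrt(n) + r) - Phi(1/sqrt(n) - r) = beta, by bisection."""
    if not 0.0 < beta < 1.0:
        raise ValueError("beta must lie strictly between 0 and 1")
    phi = NormalDist().cdf
    a = 1.0 / math.sqrt(n)

    def coverage(r):
        # Normal probability of the interval [a - r, a + r].
        return phi(a + r) - phi(a - r)

    lo, hi = 0.0, 1.0
    while coverage(hi) < beta:  # expand until the root is bracketed
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if coverage(mid) < beta:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


r = solve_r(n=20, beta=0.95)
print(r)
```

Because the interval is centred at $1/\sqrt{n}$ rather than at $0$, the solution is somewhat larger than the familiar two-sided Normal quantile $z_{(1+\beta)/2}$.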
Millard & Neerchal also discuss the simpler case of a one-sided tolerance interval, where $K$ is given by
$$K = \frac{t_{n-1, z_\beta \sqrt{n}, 1-\alpha}}{\sqrt{n}}.$$
The notation $t_{\nu, \delta, p}$ refers to the $p^\text{th}$ quantile of the non-central Student t distribution with $\nu$ degrees of freedom and noncentrality parameter $\delta$, and $z_p$ refers to the $p^\text{th}$ quantile of the standard Normal distribution, $z_p = \Phi^{-1}(p)$. This directly answers the present question.
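To make the one-sided formula concrete, here is a rough Python sketch that uses only the standard library. Since that library has no noncentral t distribution, the CDF is approximated by Simpson's rule over the distribution of $\sqrt{\chi^2_\nu}$ and the quantile is found by bisection; the function names, the integration grid, and the truncation point are all my own assumptions, not anything taken from Millard & Neerchal.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def chi_pdf(s, nu):
    """Density of S = sqrt(V) where V is chi-square with nu degrees of freedom."""
    if s <= 0.0:
        return math.sqrt(2.0 / math.pi) if nu == 1 else 0.0
    log_pdf = (math.log(2.0) + (nu - 1) * math.log(s) - 0.5 * s * s
               - 0.5 * nu * math.log(2.0) - math.lgamma(0.5 * nu))
    return math.exp(log_pdf)


def nct_cdf(t, nu, delta, steps=2000):
    """P(T <= t) for T = (Z + delta) / sqrt(V / nu), by Simpson's rule.

    Uses P(T <= t) = E[Phi(t * S / sqrt(nu) - delta)]; the integral over S
    is truncated at sqrt(nu) + 10 (an assumed cut-off, far into the tail).
    """
    upper = math.sqrt(nu) + 10.0
    h = upper / steps  # steps must be even for Simpson's rule
    scale = t / math.sqrt(nu)
    total = 0.0
    for i in range(steps + 1):
        s = i * h
        weight = 1 if i in (0, steps) else (4 if i % 2 else 2)
        total += weight * _STD_NORMAL.cdf(scale * s - delta) * chi_pdf(s, nu)
    return total * h / 3.0


def nct_ppf(p, nu, delta):
    """p-th quantile of the noncentral t distribution, by bisection."""
    lo, hi = delta - 1.0, delta + 1.0
    while nct_cdf(lo, nu, delta) > p:  # widen the bracket downwards
        lo -= 2.0 * (hi - lo)
    while nct_cdf(hi, nu, delta) < p:  # widen the bracket upwards
        hi += 2.0 * (hi - lo)
    for _ in range(50):
        mid = 0.5 * (lo + hi)
        if nct_cdf(mid, nu, delta) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def one_sided_k(n, beta, conf):
    """K = t_{n-1, z_beta * sqrt(n), conf} / sqrt(n), with conf = 1 - alpha."""
    delta = _STD_NORMAL.inv_cdf(beta) * math.sqrt(n)
    return nct_ppf(conf, n - 1, delta) / math.sqrt(n)


k = one_sided_k(n=10, beta=0.95, conf=0.95)
print(k)  # upper tolerance bound: sample mean + k * sample standard deviation
```

In practice a statistical library's noncentral t quantile function is the better tool; the hand-rolled integration above is only meant to show what the formula computes.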
Software
I provide an Excel (VBA) macro for these (Normal-theory) intervals in a spreadsheet at http://www.quantdec.com/envstats/software/intervals.xls. Although I have not tried it, the R package tolerance appears to offer a broad suite of tolerance intervals (these formulas need to be modified to apply to regression residuals, for instance).
The non-central Student t distribution is available via the ncp argument of the qt function (and its relatives) in the base R installation.
References
Wald, A., and J. Wolfowitz. (1946). Tolerance Limits for a Normal Distribution. Annals of Mathematical Statistics 17, 208-215.
Guttman, I. (1970) Statistical Tolerance Regions: Classical and Bayesian. Hafner Publishing Co., Darien, CT.
(I omit the Gibbons reference because it has typographical errors and adds nothing fundamentally different.)
-
Thanks for providing this information. But I don't see how you apply the classical theory for a Gaussian sample to the case of my question. Do you claim that you divide the classical tolerance bound for $(x_i)$ by $\hat\sigma$? – Stéphane Laurent Jan 09 '13 at 18:36
-
I missed that point. Your answer is indeed correct: you only need to add a constant to a confidence limit for $\mu/\sigma$, the reciprocal of the coefficient of variation. (+1 to your answer for that.) However, I have long wanted to have an exposition of the basic theory of tolerance intervals somewhere on this site, so if you don't mind, I would like to keep my answer open: in that way this thread will provide a reference for several kinds of tolerance intervals. – whuber Jan 09 '13 at 19:36
-