Obtaining $p$-values for UMP $t$ tests for equivalence

Question

I am trying to understand the uniformly most powerful (UMP) equivalence tests presented by Welleck (2010) with respect to $p$-values for the tests. In the case of the paired $t$ test example concerning measures before and after an intervention (pp 94–96) Wellek presents a UMP one-sample $t$ test of mean equivalence with symmetric equivalence boundaries where $ -\theta_{1} = \theta_{2} = \varepsilon = 0.5$, for the negativist null hypothesis H$^{-}_{0}\text{: }\left|t\right| \ge \tilde{C}_{\alpha,\nu}(\varepsilon)$ and gives:

$n=23$ (sample size)
$t=\frac{0.16}{\frac{3.99}{\sqrt{23}}} = 0.1923$ (test statistic)
$\tilde{C}_{.05, 22}(0.5) = \sqrt{F^{-1}_{\nu_{\text{n}},\nu_{\text{d}},\psi^{2}}(0.05)} = 0.7595$ (critical value),

where $F^{-1}$ is the quantile function (inverse CDF) of the noncentral F distribution, $\nu_{\text{n}}= 1$, $\nu_{\text{d}}= n-1 = 22$, and $\psi^{2} = n\varepsilon^{2} = 5.75$.

We would therefore reject H$^{-}_{0}$ in favor of H$^{-}_{\text{A}}$, and conclude that, for $\alpha=0.05$ and $\varepsilon = 0.5$, mean measurements before and after intervention were equivalent (we would also draw this conclusion in a combined test of H$^{-}_{0}$ and H$^{+}_{0}$, since we would reject the first, but not the second).

What is the $p$-value of this test? My guess is that it is [edited to remove the "$\mathbf{1-}$" confusion I had... forgot which tail I was in]:
$p = \text{P}(\left|T\right| < \left|t\right|) = F_{\nu_{\text{n}},\nu_{\text{d}},\psi^{2}}(t^2) = 0.00882$,

where $F$ is the CDF of the noncentral F distribution.

Is this the correct $p$-value for this specific example, and is this the way to approach $p$-values generally for t tests for equivalence?

In the non-symmetric equivalence interval case, Wellek describes an iterative process to numerically solve for the values $C_{1}$ and $C_{2}$, corresponding to the rejection regions H$^{-}_{0}\text{: } t \le C_{1}$ or $t \ge C_{2}$. Any suggestions for how to obtain the $p$-values for $t$ in this case?

Update: would: $\psi = n\left|\theta_{1}\theta_{2}\right|$ in (4) above in the asymmetric case?

Update 2: Bounty for explicitly describing the steps required to iteratively estimate $p$-values in the asymmetric case as hinted at in Horst's response, including how to solve $\min_{i \in l,u}\left|T−t_{i}\right|=0$.

References

Wellek, S. (2010). Testing Statistical Hypotheses of Equivalence and Noninferiority. Chapman and Hall/CRC Press, second edition.

score 3 · Answer 1 · edited Apr 29 '14 at 15:18

3

For all kinds of tests, the $p$-value can be imagined as the particular $\alpha$ for which the present data's test would be on the edge between significance and nonsignificance. In other words: The $p$-value is the $\alpha$ that makes the critical value coincide with the test statistic. So if the test statistic of your data is $T$, the $p$-value is $F_{1,n-1,\psi^{2}=n\epsilon^{2}}(T^2)$.

Your guess is inspired from point hypothesis testing: There you can calculate $P_{H_0}(|T|<|t|)$. Here, $H_0$ is no more just a point in the parameter space. Likewise, $P_{H_0}$ is a set of different probability measures. (Also, I could not reproduce your evaluation of the cdf-F function.)

For the case of assymetric equivalence regions, it's basically similar but since the critical values change asymetrically as $\alpha$ changes, calculation of the $p$-values would require nested iterative calculations. Namely, let $t_l(\epsilon_l, \epsilon_u, \alpha)$ be the lower critical value calculated iteratively, similarly $t_u$. A crude approach to find the $p$-value would be to solve $\min_{i\in {l,u}} |T-t_i| = 0$ numerically for $\alpha$ e.g. by Newton's method. This $\alpha$ would be the $p$-value, because $T$ is on the edge of significance iff it conicides with one bound of the critical region. That's an expensive approach, though.

Note that due to FDA/ICH guidelines, the easier, more flexible (but also more conservative) TOST should be preferred to this exact test in biostatistical applications.

edited Apr 29 '14 at 15:18

Alexis

26,219
5
78
131

answered Apr 28 '14 at 15:12

Horst Grünbusch

5,020
17
22

Horst, do you mean $\delta = n\varepsilon^{2}$? (see page 96, second paragraph). And if you do mean $\varepsilon^{2}$, then why this difference, when the statistic itself uses $\psi$ as I have reproduced from the text? Also: could you (or has someone else) elaborate on the iterative process in the asymmetric case? I have played (extensively) with Wellek's code for asymmetric intervals, but he leaves out the $p$-values. – Alexis Apr 28 '14 at 22:18
I think $\delta$ must indeed be $\psi (= n\varepsilon^{2}$) and not merely $\varepsilon^{2}$, otherwise for Wellek's example (Rejected H$^{-}_{0}$) we get $p=0.133$ which cannot be for $\alpha=0.05$. – Alexis Apr 28 '14 at 22:32
I think I had p. 94 in mind. Unfortunately, he somehow changed notation. Just edit my answer and I'll check it. – Horst Grünbusch Apr 28 '14 at 23:03
In the asymmetric case, does $\psi = n\left|\theta_{1}\theta_{2}\right|$? For example, in the asymmetric example referenced at the bottom of page 96, and implemented in the `tt1st` file, would $\psi = 36\left|-0.4716\times 0.3853\right| = 6.542$? – Alexis Apr 28 '14 at 23:20
@All: The source code of tt1st can be found at http://www.crcpress.com/product/isbn/9781439808184. I did not see there why $psi=n|\theta_1 \theta_2|$. (Also I might unintentionally have rejected your edit as I wanted to look it first up in the book.) – Horst Grünbusch Apr 29 '14 at 10:39
Right, the book says that for the symmetric case where $-\theta_{1} = \theta_{2} = \varepsilon$, that $\psi = n\varepsilon^2$. But I also want to calculate $p$-values for the asymmetric case. Wellek does not provide a definition of $\psi$ there (and does not aim at $p$-values in any case), so $\psi=n\left|\theta_{1}\theta_{2}\right|$ is a speculation as part of my question on how to evaluate in the asymmetric case. While $C_{1}$ and $C_{2}$ are iteratively calculated, it is not clear that I require these values: I want to measure a probability on $t$, not on $C_{1}$ and $C_{2}$. – Alexis Apr 29 '14 at 13:17
Just try to calculate the $p$-values in the assymetric case the way I suggested. There cannot be an accessible formula since it must reflect the assymetry between the equivalence bounds the same way as $C_1$ and $C_2$ do. Just imagine how you have to change one bound in order to keep the $p$-value constant as the other bound is moving. Your speculation would suggest something like $\theta_1 = \theta^{-1}_2$, but this is too simple to be true. – Horst Grünbusch Apr 29 '14 at 15:10
I do not understand your suggestion (appologies if I am being obtuse). For example, just as in the asymmetric case $p = F_{1,\nu,\psi}(t^{2})$ is not a function of the critical value $\tilde{C}$, shouldn't $p$ in the asymmetric case likewise not be a function of the critical values $C_{1}$ and $C_{2}$? – Alexis Apr 29 '14 at 15:13
Even in the symmetric case the $p$-value will be calculated similar to the critical values. It is not mere conincidence that both involve $F$-distributions. That's also why in the assymetric case the $p$-value will inherit this complicacy from the critical values. – Horst Grünbusch Apr 29 '14 at 15:24
I cannot log in to chat. Your comment about $p$ values in the symmetric case makes no sense: your own definition in your answer **does not involve** $\tilde{C}$. – Alexis Apr 29 '14 at 15:29
Remember: To get the $p$-value, you have to find the $\alpha$ that makes the critical value coincide with the test statistic. Just look up the definitions of $p$-value and type-I-error. So clearly calculating the $p$-value involves the same calculation process as to the critical value, in the symmetric case the evaluation of $F$ or $F^{-1}$. In the assymetric case, it's much more complicated than a single $F$-distribution. You can try the iterative process I suggested. – Horst Grünbusch Apr 29 '14 at 15:58
I get that they both use the CDF, my point is that I do not need to know the critical rejection value (a researcher choice) in order to measure the probability of observing my test statistic if the null hypothesis is true (a function of the data plus H$_{0}$). In difference tests one does not use $\alpha$ or $t_{\alpha}$ to calculate $p$. Similarly, in the symmetric case one does not use $\tilde{C}$ to calculate $p$. I am skeptical that in the asymmetric case a researcher decision (rejection threshold) is now part of calculating $p$. – Alexis Apr 29 '14 at 16:34
Please read carefully what I wrote. I never wrote that you need to know the critical value. But $p\geq\alpha$ is equivalent to the test statistic $T$ being outside the critical region $\mathcal{C}_{\alpha}$. And vice versa $p=p_0$ is equivalent to $T$ laying on the border of $\mathcal{C}_{p_0}$. So if you cannot compute the critical value in case of $\alpha=p_0$ other than by an iterative process, you also cannot compute a $p$-value in closed form that turns out to be $p_0$. – Horst Grünbusch Apr 29 '14 at 18:42
Ok, I *think* the fog begins to clear. I still need more help, though, because I don't understand how to "solve $\min_{i\in {l,u}} |T-t_i| = 0$ numerically for $\alpha$ e.g. by Newton's method". How can I begin to outline the steps needed (assuming I can do this, I might move on to fester numerical methods, yes?). – Alexis Apr 30 '14 at 16:03
Clearly, the left hand side of $|\min_{i\in{l,u}}|T-t_{i}(\epsilon_l,\epsilon_u,n,p)|$ isn't differentiable due to $||$ and $\min$. You can strip these parts. Ask another question in math.stackexchange.com if you don't see how, because this numerical problem may occur elsewhere as well. – Horst Grünbusch May 02 '14 at 12:53

score 2 · Accepted Answer · edited May 09 '14 at 17:27

Calculating the $p$-values for asymmetric equivalence bounds is a nested process. First, one has to set up the procedure to find the critical values for each $\alpha^*$. Note that this $\alpha^*$ is not the intended type I error but a placeholder for any such level. Then, one has to find the $\alpha^*$ such that one of the critical values equals the test statistic.

Procedure for Critical Values

Remember that the power function (or rejection probability) $\beta_.(\delta)$ for a point hypothesis test with test statistic $T$ and critical region $\mathbb{R}\setminus[C_1(\alpha),C_2(\alpha)]$ is $$\beta_.(\delta)=P_{\delta}(T>C_2(\alpha))+P_{\delta}(T<C_1(\alpha)).$$ Turning to an equivalence test, we somehow change hypothesis and alternative. For point hypotheses, the rejection probability $\alpha$ had to be maintained at the point the hypothesis was formulated about. For equivalence tests, it is the border of the equivalence region $]\epsilon_1,\epsilon_2[$ where the rejection probability has to be equal to $\alpha$. Then it will be $\leq\alpha$ also for the rest of the hypothesis space. That means that we have to plug the $\epsilon$s into the noncentrality parameters. So in order to get the equivalence test's critical values for a $\alpha^*$, we have to solve both $$T_{df,\epsilon_1}(C_1) - T_{df,\epsilon_1}(C_2) = \alpha^*$$ $$T_{df,\epsilon_2}(C_1) - T_{df,\epsilon_2}(C_2) = \alpha^*,$$ where $T_{df,\delta}$ denotes the CDF of a $t$-distribution with $df$ degrees of freedom and noncentrality parameter $\delta$. Numerically this can be done by the Newton-(Raphson)-Algorithm for the mapping $$h_{\alpha^*}(C_1,C_2) = (T_{df,\epsilon_1}(C_1) - T_{df,\epsilon_1}(C_2) -\alpha^*, T_{df,\epsilon_2}(C_1) - T_{df,\epsilon_2}(C_2) - \alpha^*)$$ In R and with df as degree of freedom, e1 and e2 as lower and upper equivalence margin and library(rootSolve) loaded, this can be implemented by

h <- function(c1,c2){ c(pt(c2,df=df,ncp=e1) - pt(c1,df=df,ncp=e1) - alphastar pt(c2,df=df,ncp=e2) - pt(c1,df=df,ncp=e2) - alphastar) } c <- multiroot(h,c(e1,e2))\$root #please ignore the backslash

Note that this involves numerical derivation of pt. It will be computationally faster and more precise to program the Newton algorithm by hand here. Then the function dt, the density of the $t$-distribution, can be used directly in the Jacobian of h.

$p$-values

Now as the $p$-value is the particular $\alpha^*$ that makes test statistic T and either critical value coincide, one has to solve $C_1(\alpha_1^*)+T=0$ and $C_2(\alpha_2^*)+T=0$. Then the $p$-value is the minimum of both $\alpha^*$s. So you have to nest the critical value algorithm within the solving process of $T-C_1(\alpha^*)$. In R this can look like this (e1, e2, df, test statistic T fixed as before):

hC1 <- function(alphastar){ #previous code block goes here C <- multiroot(h,c(e1,e2))\$root T+C[1] } p1 <- multiroot(hC,1).root hC2 <- function(alphastar){#previous code block goes here C <- multiroot(h,c(e1,e2))\$root T+C[2] } p2 <- multiroot(hC2,1)\$root pval <- min(p1,p2) In the end, pval is the asymmetric $p$-value. Maybe it is possible with some heuristics to tell in advance if p1 od p2 is smaller. Then one need only to calculate the respective value. But since the equivalence region is asymmetric, I'm not sure about it.

+1 mega-kudos, Horst. Do you have thoughts about initial values for $\alpha^{*}$? — Alexis, May 09 '14 at 17:20
The critical values are strictly monotonous in $\alpha^*$ so "good" initial values don't change the result, they would spare you only some iterations. Maybe 0.5 is faster than what I suggested, 1. — Horst Grünbusch, May 11 '14 at 15:56

Obtaining $p$-values for UMP $t$ tests for equivalence

2 Answers2

Procedure for Critical Values

$p$-values

Linked