
Setup

We observe a realization of a random variable $X$ assumed to be distributed $X \sim F(\cdot \mid \theta)$, where $\theta \in \Theta$ is unknown and is the parameter of interest. We want to define a confidence set $C(X) \subset \Theta$ with coverage probability $1 - \alpha$, or formally: $$ \inf_{\theta \in \Theta} \text{Pr}_\theta (\theta \in C(X)) \ge 1 - \alpha $$ One way to construct such a confidence set is through test inversion. Specifically, we take as our confidence set all $\theta_0 \in \Theta$ for which we cannot reject the null hypothesis $H_0 : \theta = \theta_0$ at level $\alpha$: $$ C(X) = \{\theta_0 \in \Theta : H_0 : \theta = \theta_0 \text{ is not rejected}\} $$ Assuming that each hypothesis test is valid (i.e., its size is at most $\alpha$), this confidence set has coverage probability at least $1 - \alpha$, because for any $\theta \in \Theta$: $$ \text{Pr}_\theta(\theta \in C(X)) = \text{Pr}_\theta(H_0 : \theta_0 = \theta \text{ is not rejected}) \ge 1 - \alpha $$ where the last inequality follows from the validity of the test.
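To make the construction concrete, here is a minimal sketch (not from the original post) for the simplest case: inverting a two-sided z-test for the mean of a $N(\theta, 1)$ sample. The set of non-rejected $\theta_0$ is exactly the familiar interval $\bar{x} \pm z_{\alpha/2}/\sqrt{n}$; the code checks that the "accepted by the test" criterion and the closed-form interval agree. The function names and the choice of 1.96 as an approximate critical value are illustrative assumptions.

```python
import random
import statistics

# Illustrative sketch: invert a two-sided z-test for the mean of N(theta, 1).
# H0: theta = theta0 is rejected when |sqrt(n) * (xbar - theta0)| > z_crit.
Z_CRIT = 1.96  # approximate 97.5th percentile of N(0, 1), so alpha = 0.05

def z_test_accepts(xbar, n, theta0, z_crit=Z_CRIT):
    """True when H0: theta = theta0 is NOT rejected at level alpha."""
    return abs((xbar - theta0) * n ** 0.5) <= z_crit

def inverted_ci(xbar, n, z_crit=Z_CRIT):
    """Closed-form version of the set {theta0 : test does not reject theta0}."""
    half_width = z_crit / n ** 0.5
    return (xbar - half_width, xbar + half_width)

# Check that the two constructions agree on a few candidate theta0 values.
random.seed(0)
n, theta_true = 50, 2.0
xbar = statistics.fmean(random.gauss(theta_true, 1.0) for _ in range(n))
lo, hi = inverted_ci(xbar, n)
for theta0 in [lo - 0.01, lo + 0.01, xbar, hi - 0.01, hi + 0.01]:
    assert z_test_accepts(xbar, n, theta0) == (lo <= theta0 <= hi)
```

In this case the inversion can be solved in closed form; for less tractable tests one would instead scan a grid of candidate $\theta_0$ values and keep those the test does not reject.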

Question

While the above definitions make sense, I have always been confused about how to reconcile this with my understanding of hypothesis tests. Specifically, my understanding is that the result of a hypothesis test and its corresponding p-value tell you nothing about the probability that the null hypothesis is true (i.e., $\text{Pr}(\theta = \theta_0 \mid X)$). Yet the confidence interval constructed via test inversion gives you a subset of $\Theta$ for which we seem to know the probability that $\theta$ lies in it. This seems contradictory.

Does anyone have any intuition (or can correct a mistake somewhere) for how to reconcile this understanding of hypothesis tests with valid confidence intervals from test inversion?

  • It's contradictory because you misunderstand what a CI is. Please visit https://stats.stackexchange.com/questions/tagged/confidence-interval?sort=votes . – whuber Aug 15 '17 at 21:35

1 Answer


First, note that there are various ways to test a hypothesis, and there are various ways to construct a confidence interval. As a result, it is possible for a given test to be inconsistent with a given confidence interval (see: Is rejecting the hypothesis using p-value equivalent to hypothesis not belonging to the confidence interval?). That said, you can create a confidence interval using the method you describe, and it will then correspond exactly to the test it inverts.

Your interpretation of hypothesis testing ("their results and corresponding p-values tell you nothing about the probability that the null-hypothesis is true") is correct. Your confusion lies in your description of the nature of a (frequentist) confidence interval. The description is reasonable, as far as it goes, but there is a subtle ambiguity that is leading you astray. Namely, the probability in the definition of the confidence interval pertains to the long-run frequency of confidence intervals that cover the true value of $\theta$. It is not the probability that any given confidence interval does. A realized confidence interval is a set of fixed values. It either includes the true value of $\theta$ (with probability $1$) or it doesn't (in which case the probability is $0$), but you don't know which. It may help to read: Why does a 95% Confidence Interval (CI) not imply a 95% chance of containing the mean?
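The long-run frequency interpretation above can be sketched with a small simulation (an illustrative addition, not part of the original answer): repeatedly draw samples from $N(\theta, 1)$, form the interval $\bar{x} \pm 1.96/\sqrt{n}$ each time, and record how often the interval covers the true $\theta$. Each realized interval either covers $\theta$ or it does not; only the long-run proportion is close to 95%.

```python
import random

# Illustrative sketch: long-run coverage of the 95% interval
# xbar +/- 1.96 / sqrt(n) for the mean of N(theta, 1).
random.seed(1)
n, theta, reps = 25, 0.0, 10_000
covered = 0
for _ in range(reps):
    xbar = sum(random.gauss(theta, 1.0) for _ in range(n)) / n
    half = 1.96 / n ** 0.5
    # Each realized interval either contains theta or it does not ...
    covered += (xbar - half) <= theta <= (xbar + half)
# ... but across many repetitions, roughly 95% of them do.
print(covered / reps)  # close to 0.95
```

The probability statement is about the procedure (the random interval $C(X)$), not about any one interval printed on the screen.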

gung - Reinstate Monica