Is there a theoretical motivation for how we construct confidence regions?

Question

I've recently had to construct a confidence region for a vector of means $\theta \in \mathbb{R}^k$, and I realized that my understanding of the fundamentals of building confidence regions is lacking. Loosely speaking, my question is: why do we construct confidence regions the way we do, and why can't I construct them in another way? In particular, I am interested in the underlying reasons for them to be connected sets, or to possess other properties such as being ellipsoidal, as is the case when constructing confidence regions for asymptotically Gaussian estimators. I have quite a few questions, though I believe they are all related, and I would greatly appreciate any insights or pointers to resources.

To motivate ideas, let's first consider a confidence region (CR) for a real-valued parameter. Given data $X_1,\dots,X_n$, I can create a 95% CR for a parameter $\theta$ by choosing sample statistics $L_n(x_1,\dots,x_n)$ and $U_n(x_1,\dots,x_n)$ such that $$P(L_n \leq \theta \leq U_n) = 0.95,$$ and we call $[L_n,U_n]$ a 95% CR. A first question is:

1. What is stopping me from building a 5% CR $[L_n(.05),U_n(.05)]$ and then taking my 95% CR to be $CR_1 \equiv \mathbb{R} - [L_n(.05),U_n(.05)]$, where $A-B$ denotes $A$ minus the subset $B$?

Clearly this set has 95% coverage. One answer I guess is that encoded in the definition of a CR is that $L_n \leq \theta \leq U_n$, i.e. some notion of the CR being a connected set, but is there a formal reason for wanting that property?
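To make the coverage claim concrete, here is a minimal simulation sketch. It assumes a normal model with known $\sigma$ and a central 5% interval around the sample mean; the function name `complement_coverage` and all parameter values are illustrative choices, not part of the question.

```python
from statistics import NormalDist

import numpy as np


def complement_coverage(theta=0.0, sigma=1.0, n=25, reps=200_000, seed=0):
    """Estimate the coverage of a central 5% interval and of its complement.

    Assumption: X_i ~ N(theta, sigma^2) with sigma known, so the sample
    mean is N(theta, sigma^2 / n) and can be drawn directly.
    """
    rng = np.random.default_rng(seed)
    se = sigma / np.sqrt(n)
    xbar = rng.normal(theta, se, size=reps)

    # Half-width z * se with P(|Z| <= z) = 0.05, i.e. z = Phi^{-1}(0.525).
    z = NormalDist().inv_cdf(0.525)
    in_5pct_interval = np.abs(xbar - theta) <= z * se

    # theta lies in R minus [L, U] exactly when it is NOT in [L, U].
    return in_5pct_interval.mean(), (~in_5pct_interval).mean()


cov_5, cov_95 = complement_coverage()
print(f"coverage of 5% interval:   {cov_5:.4f}")
print(f"coverage of its complement: {cov_95:.4f}")
```

Under these assumptions the complement does cover $\theta$ about 95% of the time, but it is an unbounded set, whereas the usual interval $\bar{x} \pm 1.96\,\sigma/\sqrt{n}$ has finite length. Coverage alone therefore does not distinguish the two.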

A second question:

2. Why are CRs for some parameters (e.g. the mean of normally distributed data) symmetric, while others are not (bootstrap CRs, or maybe CRs for a binomial proportion)? In higher dimensions, what determines the shape?

Here, my sense is that the answer is that given a measurable space, we would want to minimize the measure of the CR, but I'm not sure. I know about highest density regions (HDRs) but again I don't seem to see why this has to be the way to do things. Also, there are certainly ways to construct different shapes with the same measure, so this can't be enough (just think of a rectangular CR and just rotate it).

These questions become harder to think about in higher dimensions, where there are many sets that are connected (rectangle? ellipse? polygon?) and have some notion of symmetry, and I can certainly imagine that there is a trade-off where some shapes have smaller measure (question 2) but at the cost of being less natural (generalization of question 1). And here's the kicker for my situation:

I want to construct a 95% CR for a vector of means, but my end goal is to create a 95% CR for a function of these means. My thought was to build a 95% CR for the means and map it through my function to obtain a 95% CR for the function. In playing around with my function, it's clear that how I choose to create my 95% CR (i.e. choosing an ellipsoidal CR vs. a rectangular one vs. a really odd connected set I forced in my code) implies very different results for the size of the CR of the function. For example, if the target parameter was $f(x,y) = x+2y$ and the CR for $(x,y)$ is a rectangle, then rotating it will affect the CR of the function.
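The rotation effect can be sketched numerically. The snippet below assumes, purely for illustration, that the estimator of $(x,y)$ is bivariate normal with identity covariance, so that a square, any rotation of that square, and a circle can each be calibrated to exactly 95% joint coverage. It then computes the half-width of the image of each region under $f(x,y) = x + 2y$. The names `square_half_width` and `half_widths` are hypothetical.

```python
import math
from statistics import NormalDist

import numpy as np

c = np.array([1.0, 2.0])  # coefficients of f(x, y) = x + 2y

# Assumption: (x_hat, y_hat) ~ N(theta, I_2), so every region below,
# centered at the estimate, has exactly 95% joint coverage.
# Square: each side must cover with probability sqrt(0.95).
a = NormalDist().inv_cdf((1 + math.sqrt(0.95)) / 2)
# Circle: radius^2 is the 95% quantile of chi-square(2), which is -2*ln(0.05).
r = math.sqrt(-2 * math.log(0.05))


def square_half_width(phi):
    """Half-width of f over a square with half-side a, rotated by angle phi."""
    u1 = np.array([math.cos(phi), math.sin(phi)])
    u2 = np.array([-math.sin(phi), math.cos(phi)])
    # A linear function attains its extremes at the corners of the square.
    return float(a * (abs(c @ u1) + abs(c @ u2)))


norm_c = float(np.linalg.norm(c))
half_widths = {
    "axis_square": square_half_width(0.0),
    "rotated_square": square_half_width(math.pi / 4),
    "aligned_square": square_half_width(math.atan2(c[1], c[0])),
    "circle": r * norm_c,
    # Interval built directly for f, using Var(x_hat + 2 y_hat) = 5.
    "direct": NormalDist().inv_cdf(0.975) * norm_c,
}

for name, hw in half_widths.items():
    print(f"{name:>15}: f(theta_hat) +/- {hw:.3f}")
```

Under these assumptions every projected interval is at least as wide as the interval built directly for $f$. Projecting a 95% joint region gives an interval with coverage of at least 95%, so the shape and orientation of the joint region only change how conservative the result is.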

This may motivate me to "cheat" by choosing a CR with a shape that favors what I'm hoping to show. More generally, I guess this touches on problems that arise if you invert CRs to perform hypothesis tests. In the univariate case, what if I choose to build CRs that extend far into the positive numbers, helping me "artificially" conclude that the CR does not contain 0?
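A small sketch of the univariate version of this concern, assuming a normal estimator with known standard error: any split of the 5% error between the two tails gives a 95% interval, provided the split is fixed before seeing the data. The function `split_tail_interval` and the numbers used are hypothetical.

```python
from statistics import NormalDist


def split_tail_interval(xbar, se, alpha_lower, alpha=0.05):
    """Interval missing theta from below w.p. alpha_lower, from above w.p. alpha - alpha_lower.

    Assumption: xbar ~ N(theta, se^2) with se known.
    Requires 0 < alpha_lower < alpha so both quantiles are finite.
    """
    nd = NormalDist()
    lower = xbar - nd.inv_cdf(1 - alpha_lower) * se
    upper = xbar + nd.inv_cdf(1 - (alpha - alpha_lower)) * se
    return lower, upper


# Same hypothetical data, two different splits of the 5% error.
equal_tails = split_tail_interval(xbar=1.8, se=1.0, alpha_lower=0.025)
skewed = split_tail_interval(xbar=1.8, se=1.0, alpha_lower=0.049)

print("equal tails:", equal_tails)
print("skewed     :", skewed)
```

With these illustrative numbers the equal-tailed interval contains 0 while the skewed one, which extends much further into the positive numbers, does not. Both have 95% coverage only if the split is chosen in advance; picking it after looking at the data invalidates the coverage statement.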

Added a last paragraph, but I guess the same comments apply to hypothesis testing? — doubled, Nov 08 '20 at 22:07
