(This is more of a remark than a question; essentially, I want to know whether there is any existing work in this direction.)

I find methods like linear regression unsatisfactory, mainly because the prediction for y given x is usually some version of E(Y|X=x).

I think a far better alternative would be to specify a confidence interval at a given level. But I am fairly convinced that, for a given confidence level (say 95%), there are multiple ways to come up with such an interval. To give a pictorial intuition, let's say:

The unit square (with area 1) represents our uniform probability space. Then there are many subregions that each cover an area of 0.95. Which one is better? We could say that a given area patch is better if it contains the mean, has less variance, or is connected, which essentially amounts to saying that the best patch minimizes a certain (user-given) cost function. It could be my ignorance, but I have not seen this kind of question answered in any textbook.

  • E(Y|X=x) implies that Y has a distribution (conditioned on X), which in the case of least-squares regression is Gaussian, $Y \sim N(X\beta, \sigma^2)$, so you can get a predictive interval from the quantiles of that distribution (or just use $\sigma$). Exploiting the predictive distribution of a model is often very useful. Bayesian credible intervals seem more satisfying than confidence intervals for those sorts of applications, though, as you are usually interested in the predictions made by that model rather than in a statement about a (fictitious) population of models. – Dikran Marsupial Apr 24 '21 at 15:10
  • Most forms of regression, including linear regression, have confidence boundaries on both the regression line (or curve) and on predictions for a given value of $x$. I wonder if the dissatisfaction you feel with $E[y|x]$ is perhaps a dissatisfaction with the way regression results are *presented*, rather than with regression itself? – Alexis Apr 24 '21 at 16:51
  • I don't think that whether the interval contains the true mean is helpful as a criterion for evaluating confidence interval generation processes, since any process for generating a 95% confidence interval has a 95% chance of generating an interval which contains the true mean. – fblundun Apr 24 '21 at 17:08

1 Answer

You are correct that there is no unique choice of a confidence interval. As Ben says in this answer: "Generally speaking, there are an infinite number of possible 95% confidence intervals you could formulate."

There is much interest in trying to find the "shortest" confidence interval in some sense. This thread has extensive discussion of that issue, including examples from some standard types of estimation, discussion of just what is meant by a "shortest" interval (along with many useful links for further study), and a broader discussion of "optimal" confidence intervals based on the pivotal quantities from which confidence intervals are estimated (and thus extensible to whatever definition of "best" confidence interval you wish to use).
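As a minimal sketch of the non-uniqueness, consider the simplest textbook case (an assumption made here for illustration, not something taken from the linked threads): a normal mean with known $\sigma$, using the pivot $(\bar{x} - \mu)/(\sigma/\sqrt{n})$. Any split of the 5% miscoverage between the two tails gives a valid 95% interval, and the lengths differ. The function name `mean_interval` and the numbers fed into it are hypothetical.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, sd 1


def mean_interval(xbar, sigma, n, lower_tail, level=0.95):
    """Interval for a normal mean with known sigma.

    `lower_tail` is the share of the total miscoverage (1 - level) that the
    pivot (xbar - mu) / se is allowed to fall below its lower cutoff; the
    remainder is assigned to the upper tail.
    """
    alpha = 1.0 - level
    if not 0.0 < lower_tail < alpha:
        raise ValueError("lower_tail must lie strictly between 0 and 1 - level")
    upper_tail = alpha - lower_tail
    se = sigma / n ** 0.5
    a = STD_NORMAL.inv_cdf(lower_tail)        # lower cutoff for the pivot
    b = STD_NORMAL.inv_cdf(1.0 - upper_tail)  # upper cutoff for the pivot
    # a <= (xbar - mu) / se <= b  is equivalent to
    # xbar - b * se <= mu <= xbar - a * se
    return xbar - b * se, xbar - a * se


# Hypothetical sample summary: mean 10, known sigma 2, n = 25.
splits = [0.005, 0.0125, 0.025, 0.0375, 0.045]
lengths = {}
for p in splits:
    lo, hi = mean_interval(10.0, 2.0, 25, lower_tail=p)
    lengths[p] = hi - lo
    print(f"lower tail {p:.4f}: [{lo:.3f}, {hi:.3f}]  length {hi - lo:.3f}")
```

Every interval printed has the same 95% coverage, but only the equal-tailed split (0.025 in each tail) attains the minimum length here, because the pivot's distribution is symmetric and unimodal. With a skewed pivot, or with a cost function other than length, the preferred split would generally be a different one.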

EdM