Suppose I'm trying to estimate a large number of parameters from some high-dimensional data, using some kind of regularized estimates. The regularizer introduces some bias into the estimates, but it can still be a good trade-off because the reduction in variance should more than make up for it.

The problem comes when I want to estimate confidence intervals (e.g. using Laplace approximation or bootstrapping). Specifically, the bias in my estimates leads to bad coverage in my confidence intervals, which makes it hard to determine the frequentist properties of my estimator.

I've found some papers discussing this problem (e.g. "Asymptotic confidence intervals in ridge regression based on the Edgeworth expansion"), but the math is mostly above my head. In the linked paper, Equations 92-93 seem to provide a correction factor for estimates that were regularized by ridge regression, but I was wondering if there were good procedures that would work with a range of different regularizers.

Even a first-order correction would be extremely helpful.
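To make the coverage problem concrete, here is a minimal numpy sketch (not part of the original question; all data, settings, and names are illustrative): a percentile bootstrap interval for one ridge coefficient is centred on the shrunken, biased estimate rather than on the true value, which is exactly why nominal 95% intervals can undercover.

```python
import numpy as np

rng = np.random.default_rng(0)

def ridge(X, y, lam):
    # Closed-form ridge solution: (X'X + lam*I)^{-1} X'y
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

# Toy data with deliberately heavy shrinkage (settings chosen for illustration)
n, p, lam = 50, 10, 50.0
beta_true = np.ones(p)
X = rng.standard_normal((n, p))
y = X @ beta_true + rng.standard_normal(n)

beta_ridge = ridge(X, y, lam)
beta_ols = np.linalg.lstsq(X, y, rcond=None)[0]  # unbiased benchmark

# Percentile bootstrap interval for the first coefficient
B = 2000
boot = np.empty(B)
for b in range(B):
    idx = rng.integers(0, n, n)  # resample cases with replacement
    boot[b] = ridge(X[idx], y[idx], lam)[0]
ci_low, ci_high = np.percentile(boot, [2.5, 97.5])

# The interval brackets the shrunken (biased) estimate, not beta_true[0],
# so across repeated simulations it misses the truth more often than the
# nominal 5%.
print(beta_ridge[0], (ci_low, ci_high))
```

Ridge always shrinks the coefficient norm relative to OLS, so the bootstrap distribution inherits that bias no matter how many resamples are drawn.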

David J. Harris
  • +1 Timely and important question - though I'm not sure anyone can at present answer this in the affirmative (I guess we simply don't know how to do that properly, and if I knew, I'd have a couple of Annals of Statistics papers lined up). Related question: http://stats.stackexchange.com/questions/91462/standard-errors-for-lasso-prediction-using-r We do know that bootstrapping performs poorly in such situations, but that won't help. – Momo Jun 22 '15 at 16:40
  • Thanks for the link. Could you clarify what you meant regarding bootstrapping? – David J. Harris Jun 22 '15 at 16:55
  • Also, I'm still holding out hope that someone could have methods that work well for non-sparse regularizers. I'd imagine that the L1 penalty makes things especially difficult because of all the estimates piled up at zero. Thanks again. – David J. Harris Jun 22 '15 at 17:02
  • Sure, re bootstrap (I'm just quoting from here: http://cran.r-project.org/web/packages/penalized/vignettes/penalized.pdf): "The reason for this is that standard errors are not very meaningful for strongly biased estimates such as arise from penalized estimation methods. Penalized estimation is a procedure that reduces the variance of estimators by introducing substantial bias." – Momo Jun 22 '15 at 17:45
  • "Unfortunately, in most applications of penalized regression it is impossible to obtain a sufficiently precise estimate of the bias. Any bootstrap-based calculations can only give an assessment of the variance of the estimates [me: not entirely - in large samples it can also correct bias]. Reliable estimates of the bias are only available if reliable unbiased estimates are available, which is typically not the case in situations in which penalized estimates are used." – Momo Jun 22 '15 at 17:45
  • "Reporting a standard error of a penalized estimate therefore tells only part of the story. It can give a mistaken impression of great precision, completely ignoring the inaccuracy caused by the bias. It is certainly a mistake to make confidence statements that are only based on an assessment of the variance of the estimates, such as bootstrap-based confidence intervals do. Methods for constructing reliable confidence intervals in the high-dimensional situation are, to my knowledge, not available." Still, the question is extremely important - I hope I'm wrong. Thanks for bringing it up. – Momo Jun 22 '15 at 17:47
  • 1
    Dave, would the so-called *selection* intervals of Tibshirani and coauthors be suitable? They've worked them out for at least the Lasso, LARS, & step-wise regression via something called a *polyhedral form*. From that you can essentially form confidence intervals in the usual way but using a truncated normal with limits $c$ & $d$ informed from the data. Superficial details plus links to actual papers (most of which are on ArXiv) are in [Taylor & Tibshirani (2015, PNAS)](http://doi.org/10.1073/pnas.1507583112). – Gavin Simpson Jun 23 '15 at 20:51
  • @gavin this is a great paper, thanks for the pointer! – Momo Jun 23 '15 at 22:02
  • 1
    The paper by [Ruben Dezeure, Peter Bühlmann, Lukas Meier and Nicolai Meinshausen](http://arxiv.org/abs/1408.4026) is to the best of my knowledge the most recent and comprehensive account on inference in a high-dimensional setting. – NRH Jun 25 '15 at 09:58

2 Answers


There is a recent paper which addresses precisely your question (assuming, as I understand it, that you want to perform regression on your data) and, luckily, provides expressions which are easy to calculate ("Confidence Intervals and Hypothesis Testing for High-Dimensional Regression").

Also, you may be interested in the recent work by Peter Bühlmann on that very topic. But I believe that the first paper provides what you are looking for, and its contents are easier to digest (I am not a statistician either).
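Not from those papers themselves, but a minimal numpy sketch of the general "de-biasing" idea this line of work builds on (ridge stands in for a generic penalized estimate; all names and settings here are mine): add back a correction term $M X^\top(y - X\hat\beta)$, where $M$ approximates the inverse of $X^\top X$. In this low-dimensional toy case the exact inverse is available, and the correction removes the shrinkage bias entirely, recovering plain OLS; the high-dimensional methods replace $M$ with a regularized approximation and build normal-theory intervals around the corrected estimate.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy low-dimensional data (illustrative settings only)
n, p, lam = 100, 5, 10.0
beta_true = np.array([2.0, -1.0, 0.5, 0.0, 0.0])
X = rng.standard_normal((n, p))
y = X @ beta_true + rng.standard_normal(n)

# Ridge as a stand-in for any penalized (biased) estimate
beta_pen = np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

# De-biasing step: beta_pen + M X'(y - X beta_pen), with M ~ (X'X)^{-1}.
# With the exact inverse (possible here since p < n), the correction
# cancels the shrinkage bias exactly and returns the OLS estimate.
M = np.linalg.inv(X.T @ X)
beta_debiased = beta_pen + M @ X.T @ (y - X @ beta_pen)

beta_ols = np.linalg.lstsq(X, y, rcond=None)[0]
print(np.max(np.abs(beta_debiased - beta_ols)))  # essentially zero
```

The identity holds for any starting estimate: $\hat\beta + (X^\top X)^{-1}X^\top(y - X\hat\beta) = (X^\top X)^{-1}X^\top y$, which is why the interesting work in the high-dimensional papers is all in constructing a good approximate $M$ when $X^\top X$ is singular.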

jpmuc
  • +1 Interesting paper. So it appears there are at least three competing ideas for how to approach these problems, and from what I can see they are not closely related. Then there is also the impossibility theorem from http://journals.cambridge.org/action/displayAbstract?fromPage=online&aid=280988&fileId=S0266466605050036. It will be interesting to see how this plays out and what emerges as canonical. – Momo Jun 28 '15 at 22:21
  • Thanks. This may not be something I'm actually able to implement, but it does seem like the math works for a variety of regularized estimates. – David J. Harris Jun 29 '15 at 23:21

http://cran.r-project.org/web/packages/hdi/index.html

Is this what you're looking for?

Description: Computes confidence intervals for the l1-norm of groups of regression parameters in a hierarchical clustering tree.
Tagar