
I have a simple question.

Is the assumption of sparsity only useful when $p > n$, that is, when you have a large number of features compared to observations?

When I use a Uniform prior, I can see that a lot of coefficients (that should be zero) are not estimated to be zero. I was wondering if a sparsity assumption might help alleviate this problem.

However, re-running with a sparse Laplace prior does not shrink the coefficients as much as I would have hoped. I am re-running with more mass on 0 for the Laplace prior, but I am not sure whether this will be useful in the $n >> p$ regime.
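
For concreteness, here is a minimal sketch of the kind of model I mean. I am writing it with PyMC, but the library, the synthetic data, and the scale `b=0.1` are purely illustrative assumptions, not my actual setup:

```python
import numpy as np
import pymc as pm

rng = np.random.default_rng(0)
n, p = 500, 10                    # n >> p, as in my setting
beta_true = np.zeros(p)
beta_true[:3] = [2.0, -1.5, 1.0]  # most true coefficients are zero
X = rng.normal(size=(n, p))
y = X @ beta_true + rng.normal(scale=1.0, size=n)

with pm.Model():
    # Smaller b puts more prior mass near zero, i.e. stronger shrinkage
    beta = pm.Laplace("beta", mu=0.0, b=0.1, shape=p)
    sigma = pm.HalfNormal("sigma", sigma=1.0)
    pm.Normal("y_obs", mu=pm.math.dot(X, beta), sigma=sigma, observed=y)
    idata = pm.sample()
```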

Any suggestions/comments will be very much appreciated. Thanks!

asifzuba

1 Answer

There is an interesting review paper by van Erp et al. (2019), who compare different shrinkage (a.k.a. sparsity) priors and their effect on the estimated parameters. One of their conclusions was that for $p > n$ scenarios there are better alternatives than Ridge (Gaussian) or Lasso (Laplace) priors when it comes to variable selection; in some cases those priors simply do not induce enough sparsity. So yes, it is possible that a Laplace prior would be "not enough" to achieve sparsity.
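
For instance, one alternative compared in that review is the horseshoe prior, which pairs a global shrinkage scale with heavy-tailed local scales, so that small coefficients are pulled hard toward zero while genuinely large ones largely escape the shrinkage. A minimal sketch, assuming PyMC and synthetic data purely for illustration:

```python
import numpy as np
import pymc as pm

rng = np.random.default_rng(0)
n, p = 500, 10
beta_true = np.zeros(p)
beta_true[:3] = [2.0, -1.5, 1.0]              # sparse truth: 3 nonzero coefficients
X = rng.normal(size=(n, p))
y = X @ beta_true + rng.normal(scale=1.0, size=n)

with pm.Model():
    tau = pm.HalfCauchy("tau", beta=1.0)           # global shrinkage scale
    lam = pm.HalfCauchy("lam", beta=1.0, shape=p)  # heavy-tailed local scales
    beta = pm.Normal("beta", mu=0.0, sigma=tau * lam, shape=p)
    sigma = pm.HalfNormal("sigma", sigma=1.0)
    pm.Normal("y_obs", mu=pm.math.dot(X, beta), sigma=sigma, observed=y)
    # Horseshoe posteriors have funnel-like geometry; a higher
    # target_accept reduces divergences during sampling
    idata = pm.sample(target_accept=0.95)
```

In practice a non-centered parameterization of `beta` is often used for the same reason, but the centered version above is the simplest way to show the structure of the prior.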

Van Erp, S., Oberski, D. L., & Mulder, J. (2019). Shrinkage Priors for Bayesian Penalized Regression. Journal of Mathematical Psychology, 89, 31-50. doi:10.1016/j.jmp.2018.12.004 (preprint)

Tim
  • Ah, thank you @Tim. I will definitely hunt that reference down. However, my situation is $n >> p$, and I was wondering if I just need to keep the penalty really high to get any real sparsity in my data. – asifzuba May 26 '20 at 21:51
  • @asifzuba The same applies: maybe you should use a prior that forces more sparsity. – Tim May 26 '20 at 22:08