I have a model for longitudinal data with 10,000 random effects and 25 additional parameters. The sample size is 30,000. I have tried to use Stan, MCMCpack and others to sample from the posterior distribution, but they are all very slow.

The model is complex so it does not allow for conjugate priors.

What alternative samplers or methods can I consider?

Sadler
  • When you say slow, how long does it take? Also, can you please give some information about the structure of the model? – Fiodor1234 Nov 29 '20 at 20:23
  • @Fiodor1234 It would take a couple of weeks to sample from the posterior. This is too long for what I need. The model is a longitudinal model with non-standard choices for the distributions implied in the model. I was hoping for some general suggestions. – Sadler Nov 29 '20 at 20:28
  • What value do you use for iter in stan(..., iter=?)? I often use 5000, and in almost all cases it is fine for me. – TrungDung Nov 29 '20 at 21:21

1 Answer

George Ho gives a good set of advice and useful links for solving many common MCMC problems on his blog. The post is too long to summarize here, but in many cases you can re-parametrize the model, choose different priors, or use algebra to simplify parts of the model, any of which may speed up sampling. As George Ho mentions, the excellent Stan manual also has a chapter on this topic.
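
To make the re-parametrization idea concrete, here is a minimal sketch of the non-centered parametrization that is often suggested for random effects. It assumes normally distributed random effects, which may not match your model, and the values of `mu` and `tau` are hypothetical. It only checks that the two forms describe the same distribution; the benefit for a sampler would come from working with `z`, which does not depend on `tau` a priori.

```python
import numpy as np

rng = np.random.default_rng(0)

mu, tau = 1.5, 0.3  # hypothetical population mean and scale (assumptions)
n = 200_000

# Centered form: draw the random effects directly from Normal(mu, tau).
theta_centered = rng.normal(mu, tau, size=n)

# Non-centered form: draw standardized effects, then shift and scale them.
z = rng.normal(0.0, 1.0, size=n)
theta_noncentered = mu + tau * z

# Both forms should show (approximately) the same mean and spread.
print(theta_centered.mean(), theta_centered.std())
print(theta_noncentered.mean(), theta_noncentered.std())
```

In a Stan program the same change would mean declaring `z` as the parameter and computing the random effects from it, so the sampler explores a geometry that is less tied to the scale parameter.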

If this doesn't work, a common approach when sampling would take unreasonably long is to approximate the posterior instead of sampling from it. Variational inference is commonly used for this (see e.g. Blei et al, 2016), and ADVI (Kucukelbir et al, 2016), designed by authors of Stan and available in Stan, is among the most popular such methods. It is significantly faster, but the price you pay is that the result is only an approximation: in some cases it works well, in others poorly, and it can be hard to diagnose which (see e.g. Yao et al, 2018).
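
To give a feel for what such an approximation does, here is a minimal sketch of Gaussian variational inference with reparameterization gradients, in the spirit of ADVI but not Stan's implementation. The toy model (unit-variance normal likelihood, normal prior), the step size, and the number of draws are all assumptions chosen so the result can be compared with the exact posterior.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy data (hypothetical): y_i ~ Normal(theta, 1), prior theta ~ Normal(0, 10).
y = rng.normal(2.0, 1.0, size=50)
prior_sd = 10.0

def grad_log_joint(theta):
    # Derivative of log p(y, theta) w.r.t. theta, for an array of theta draws.
    return (y.sum() - y.size * theta) - theta / prior_sd**2

# Variational family: q(theta) = Normal(m, exp(omega)^2).
m, omega = 0.0, 0.0
lr, n_draws = 0.01, 32  # step size and Monte Carlo draws (assumptions)

for _ in range(2000):
    eps = rng.normal(size=n_draws)
    s = np.exp(omega)
    theta = m + s * eps                      # reparameterized draws from q
    g = grad_log_joint(theta)
    grad_m = g.mean()                        # ELBO gradient w.r.t. m
    grad_omega = (g * eps * s).mean() + 1.0  # "+ 1" comes from the entropy of q
    m += lr * grad_m                         # stochastic gradient ascent
    omega += lr * grad_omega

# Exact posterior for this conjugate toy model, for comparison only.
post_prec = y.size + 1.0 / prior_sd**2
exact_mean = y.sum() / post_prec
exact_sd = post_prec**-0.5

print("VI:   ", m, np.exp(omega))
print("exact:", exact_mean, exact_sd)
```

Here the true posterior is itself Gaussian, so the approximation can match it closely. With a non-standard model like yours that is not guaranteed, which is why the diagnostics discussed by Yao et al matter.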

See also Variational inference versus MCMC: when to choose one over the other?

Tim