Questions tagged [hierarchical-bayesian]

Hierarchical Bayesian models specify priors on parameters and hyperpriors on the parameters of the prior distributions.

624 questions
81
votes
2 answers

XKCD's modified Bayes theorem: actually kinda reasonable?

I know this is from a comic famous for taking advantage of certain analytical tendencies, but it actually looks kind of reasonable after a few minutes of staring. Can anyone outline for me what this "modified Bayes theorem" is doing?
eric_kernfeld
  • 4,828
  • 1
  • 16
  • 41
34
votes
2 answers

What's the difference between "deep learning" and multilevel/hierarchical modeling?

Is "deep learning" just another term for multilevel/hierarchical modeling? I'm much more familiar with the latter than the former, but from what I can tell, the primary difference is not in their definition, but how they are used and evaluated…
21
votes
1 answer

Why is LKJcorr a good prior for a correlation matrix?

I'm reading chapter 13, "Adventures in Covariance", in the (superb) book Statistical Rethinking by Richard McElreath, where he presents the following hierarchical model: (R is a correlation matrix) The author explains that LKJcorr is a weakly…
xboard
  • 1,008
  • 11
  • 17
19
votes
2 answers

What prior distributions could/should be used for the variance in a hierarchical Bayesian model when the mean variance is of interest?

In his widely cited paper Prior distributions for variance parameters in hierarchical models (916 citations so far on Google Scholar) Gelman proposes that good non-informative prior distributions for the variance in a hierarchical Bayesian model are…
Rasmus Bååth
  • 6,422
  • 34
  • 57
18
votes
1 answer

In Gelman's 8 school example, why is the standard error of the individual estimate assumed known?

Context: In Gelman's 8-school example (Bayesian Data Analysis, 3rd edition, Ch 5.5) there are eight parallel experiments in 8 schools testing the effect of coaching. Each experiment yields an estimate for the effectiveness of coaching and the…
Heisenberg
  • 4,239
  • 3
  • 23
  • 54
18
votes
2 answers

What is the problem with empirical priors?

In the literature I sometimes stumble upon the remark that choosing priors that depend on the data itself (for example Zellner's g-prior) can be criticized from a theoretical point of view. Where exactly is the problem if the prior is not chosen…
muffin1974
  • 1,152
  • 11
  • 24
17
votes
2 answers

Bayesian estimation of $N$ of a binomial distribution

This question is a technical follow-up of this question. I have trouble understanding and replicating the model presented in Raftery (1988): Inference for the binomial $N$ parameter: a hierarchical Bayes approach in WinBUGS/OpenBUGS/JAGS. It is not…
COOLSerdash
  • 25,317
  • 8
  • 73
  • 123
15
votes
3 answers

Multinomial-Dirichlet model with hyperprior distribution on the concentration parameters

I will try to describe the problem at hand as generally as possible. I am modeling observations as a categorical distribution with a parameter probability vector theta. Then, I assume the parameter vector theta follows a Dirichlet prior distribution…
15
votes
2 answers

Differences between prior distribution and prior predictive distribution?

While studying Bayesian statistics, I am having trouble understanding the differences between the prior distribution and the prior predictive distribution. The prior distribution is fairly easy to understand, but I have found it vague to understand…
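A rough illustration of the two objects named in this question, as a sketch rather than an answer from the thread: the prior is a distribution over the parameter, while the prior predictive is a distribution over data, obtained by drawing a parameter from the prior and then simulating data from the likelihood. The Beta(2, 2) prior, the Binomial likelihood with n = 10, and the function names below are all assumptions chosen for illustration; none of them come from the question.

```python
import random


def draw_prior(rng, a=2.0, b=2.0):
    """One draw of theta from the (assumed) Beta(a, b) prior."""
    return rng.betavariate(a, b)


def draw_prior_predictive(rng, n=10, a=2.0, b=2.0):
    """One draw of y from the prior predictive: theta ~ Beta, then y ~ Binomial(n, theta)."""
    theta = draw_prior(rng, a, b)
    # Sum of n Bernoulli(theta) trials is a Binomial(n, theta) draw.
    return sum(rng.random() < theta for _ in range(n))


rng = random.Random(0)  # seeded so the sketch is reproducible
prior_draws = [draw_prior(rng) for _ in range(5)]                  # values of theta in [0, 1]
predictive_draws = [draw_prior_predictive(rng) for _ in range(5)]  # counts in 0..10

print("prior draws (parameter scale):", prior_draws)
print("prior predictive draws (data scale):", predictive_draws)
```

The point of the sketch is only that the first list lives on the parameter scale and the second on the data scale.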
14
votes
1 answer

Hyperprior density for hierarchical Gamma-Poisson model

In a hierarchical model of data $y$ where $$y \sim \textrm{Poisson}(\lambda)$$ $$\lambda \sim \textrm{Gamma}(\alpha, \beta)$$ it appears to be typical in practice to choose values $(\alpha, \beta)$ such that the mean and variance of the gamma…
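For reference, a minimal sketch of how $(\alpha, \beta)$ relate to the mean and variance of the Gamma prior. It assumes the shape-rate parameterization (the excerpt does not say which one is meant), under which the mean is $\alpha/\beta$ and the variance is $\alpha/\beta^2$. The function names are hypothetical, and the target moments in the example are made up.

```python
def gamma_from_moments(mean, variance):
    """Return (alpha, beta) for Gamma(shape=alpha, rate=beta) with the given mean and variance.

    Assumes the shape-rate parameterization: mean = alpha / beta, variance = alpha / beta**2.
    """
    if mean <= 0 or variance <= 0:
        raise ValueError("mean and variance must be positive")
    beta = mean / variance  # rate: (alpha/beta) / (alpha/beta**2) = beta
    alpha = mean * beta     # shape: (alpha/beta) * beta = alpha
    return alpha, beta


def gamma_moments(alpha, beta):
    """Return (mean, variance) of Gamma(shape=alpha, rate=beta)."""
    return alpha / beta, alpha / beta**2


# Illustrative target moments (made up for this sketch).
alpha, beta = gamma_from_moments(mean=4.0, variance=2.0)
print(alpha, beta)                 # 8.0 2.0
print(gamma_moments(alpha, beta))  # (4.0, 2.0)
```

If the scale parameterization is intended instead, replace `beta` with its reciprocal.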
14
votes
1 answer

Why does adding a lag effect increase mean deviance in a Bayesian hierarchical model?

Background: I'm currently doing some work comparing various Bayesian hierarchical models. The data $y_{ij}$ are numeric measures of well-being for participant $i$ and time $j$. I have around 1000 participants and 5 to 10 observations per…
Jeromy Anglim
  • 42,044
  • 23
  • 146
  • 250
14
votes
2 answers

What are the parameters of a Wishart-Wishart posterior?

When inferring the precision matrix $\boldsymbol{\Lambda}$ of a normal distribution used to generate $N$ $D$-dimensional vectors $\mathbf{x_1},\ldots,\mathbf{x_N}$ \begin{align} \mathbf{x_i} &\sim \mathcal{N}(\boldsymbol{\mu, \Lambda^{-1}})…
12
votes
5 answers

What precisely does it mean to borrow information?

I often hear people talk about information borrowing or information sharing in Bayesian hierarchical models. I can't seem to get a straight answer about what this actually means and if it is unique to Bayesian hierarchical models. I sort of get the…
12
votes
1 answer

Why does the redundant mean parameterization speed up Gibbs MCMC?

In Gelman & Hill (2007)'s book (Data Analysis Using Regression and Multilevel/Hierarchical Models), the authors claim that including redundant mean parameters can help speed up MCMC. The given example is a non-nested model of "flight simulator" (Eq…
Heisenberg
  • 4,239
  • 3
  • 23
  • 54
11
votes
2 answers

What is a good analogy to illustrate the strengths of Hierarchical Bayesian Models?

I'm relatively new to Bayesian statistics and have been using JAGS recently to build hierarchical Bayesian models on different datasets. While I'm very satisfied with the results (compared to standard glm models), I need to explain to…
nassimhddd
  • 363
  • 1
  • 10