Hierarchical Bayesian models specify priors on parameters and hyperpriors on the parameters of the prior distributions.
Questions tagged [hierarchical-bayesian]
624 questions
81 votes
2 answers
XKCD's modified Bayes theorem: actually kinda reasonable?
I know this is from a comic famous for taking advantage of certain analytical tendencies, but it actually looks kind of reasonable after a few minutes of staring. Can anyone outline for me what this "modified Bayes theorem" is doing?

eric_kernfeld
34 votes
2 answers
What's the difference between "deep learning" and multilevel/hierarchical modeling?
Is "deep learning" just another term for multilevel/hierarchical modeling?
I'm much more familiar with the latter than the former, but from what I can tell, the primary difference is not in their definition, but how they are used and evaluated…

user4733
21 votes
1 answer
Why is LKJcorr a good prior for a correlation matrix?
I'm reading Chapter 13, "Adventures in Covariance," of the (superb) book Statistical Rethinking by Richard McElreath, where he presents the following hierarchical model:
(R is a correlation matrix)
The author explains that LKJcorr is a weakly…

xboard
19 votes
2 answers
What prior distributions could/should be used for the variance in a hierarchical Bayesian model when the mean variance is of interest?
In his widely cited paper Prior distributions for variance parameters in hierarchical models (916 citations so far on Google Scholar), Gelman proposes that good non-informative prior distributions for the variance in a hierarchical Bayesian model are…

Rasmus Bååth
18 votes
1 answer
In Gelman's 8-school example, why is the standard error of the individual estimate assumed known?
Context:
In Gelman's 8-school example (Bayesian Data Analysis, 3rd edition, Ch 5.5) there are eight parallel experiments in 8 schools testing the effect of coaching. Each experiment yields an estimate for the effectiveness of coaching and the…

Heisenberg
18 votes
2 answers
What is the problem with empirical priors?
In the literature I sometimes stumble upon the remark that choosing priors that depend on the data itself (for example, Zellner's g-prior) can be criticized from a theoretical point of view. Where exactly is the problem if the prior is not chosen…

muffin1974
17 votes
2 answers
Bayesian estimation of $N$ of a binomial distribution
This question is a technical follow-up of this question.
I have trouble understanding and replicating the model presented in Raftery (1988): Inference for the binomial $N$ parameter: a hierarchical Bayes approach in WinBUGS/OpenBUGS/JAGS. It is not…

COOLSerdash
15 votes
3 answers
Multinomial-Dirichlet model with hyperprior distribution on the concentration parameters
I will try to describe the problem at hand as generally as possible. I am modeling observations as a categorical distribution with a parameter probability vector theta.
Then, I assume the parameter vector theta follows a Dirichlet prior distribution…

Dnaiel
15 votes
2 answers
Differences between prior distribution and prior predictive distribution?
While studying Bayesian statistics, I am having trouble understanding the difference between the prior distribution and the prior predictive distribution. The prior distribution is fairly easy to understand, but I have found it hard to understand…

StoryMay
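The distinction asked about above can be illustrated with a small simulation. This is only a sketch under assumed choices that do not come from the question: a Beta(2, 2) prior on a success probability `theta`, a binomial likelihood with 10 trials, and NumPy for sampling. The prior describes uncertainty about the parameter; the prior predictive describes uncertainty about data not yet observed, obtained by averaging the likelihood over the prior.

```python
import numpy as np

# Illustrative (assumed) settings: Beta(a, b) prior, Binomial(n_trials, theta) likelihood.
a, b = 2.0, 2.0
n_trials = 10
n_draws = 100_000

rng = np.random.default_rng(1)

# Prior distribution: draws of the parameter theta, living on [0, 1].
theta = rng.beta(a, b, size=n_draws)

# Prior predictive distribution: draws of data y, living on {0, ..., n_trials}.
# Each y is generated from the likelihood using one theta drawn from the prior.
y_sim = rng.binomial(n_trials, theta)

print("prior mean of theta:", theta.mean())
print("prior predictive mean of y:", y_sim.mean())
print("prior predictive pmf estimate:",
      np.bincount(y_sim, minlength=n_trials + 1) / n_draws)
```

The two sets of draws live on different spaces: `theta` is a probability, while `y_sim` is a count that could be compared directly with observed data.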
14 votes
1 answer
Hyperprior density for hierarchical Gamma-Poisson model
In a hierarchical model of data $y$ where
$$y \sim \textrm{Poisson}(\lambda)$$
$$\lambda \sim \textrm{Gamma}(\alpha, \beta)$$
it appears to be typical in practice to choose values $(\alpha, \beta)$ such that the mean and variance of the gamma…

Sycorax
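The two-level Gamma-Poisson model in the excerpt above can be simulated directly. The sketch below assumes that $\beta$ is a rate parameter (the excerpt does not say whether it is a rate or a scale) and uses illustrative values for $(\alpha, \beta)$; the function names are hypothetical. Under the rate convention the gamma has mean $\alpha/\beta$ and variance $\alpha/\beta^2$, and the marginal variance of $y$ is $\alpha/\beta + \alpha/\beta^2$.

```python
import numpy as np

def gamma_moments(alpha, beta):
    """Mean and variance of Gamma(alpha, beta), assuming beta is a RATE."""
    return alpha / beta, alpha / beta**2

def simulate_gamma_poisson(alpha, beta, size, seed=0):
    """Draw lambda ~ Gamma(alpha, rate=beta), then y | lambda ~ Poisson(lambda)."""
    rng = np.random.default_rng(seed)
    # NumPy's gamma sampler takes shape and scale, so scale = 1 / rate.
    lam = rng.gamma(shape=alpha, scale=1.0 / beta, size=size)
    y = rng.poisson(lam)
    return lam, y

alpha, beta = 2.0, 0.5  # illustrative values, not taken from the question
lam, y = simulate_gamma_poisson(alpha, beta, size=200_000)
prior_mean, prior_var = gamma_moments(alpha, beta)

print("lambda mean (simulated, analytic):", lam.mean(), prior_mean)
print("lambda var  (simulated, analytic):", lam.var(), prior_var)
# Law of total variance: Var(y) = E[lambda] + Var(lambda).
print("y var       (simulated, analytic):", y.var(), prior_mean + prior_var)
```

Comparing the simulated and analytic moments is one way to check what a given choice of $(\alpha, \beta)$ implies before placing a hyperprior on them.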
14 votes
1 answer
Why does adding a lag effect increase mean deviance in a Bayesian hierarchical model?
Background: I'm currently doing some work comparing various Bayesian hierarchical models. The data $y_{ij}$ are numeric measures of well-being for participant $i$ and time $j$. I have around 1000 participants and 5 to 10 observations per…

Jeromy Anglim
14 votes
2 answers
What are the parameters of a Wishart-Wishart posterior?
When inferring the precision matrix $\boldsymbol{\Lambda}$ of a normal distribution used to generate $N$ D-dimensional vectors $\mathbf{x_1},\dots,\mathbf{x_N}$
\begin{align}
\mathbf{x_i} &\sim \mathcal{N}(\boldsymbol{\mu}, \boldsymbol{\Lambda}^{-1})…

alberto
12 votes
5 answers
What precisely does it mean to borrow information?
I often hear people talk about information borrowing or information sharing in Bayesian hierarchical models. I can't seem to get a straight answer about what this actually means and whether it is unique to Bayesian hierarchical models. I sort of get the…

Eli
12 votes
1 answer
Why does the redundant mean parameterization speed up Gibbs MCMC?
In Gelman & Hill (2007)'s book (Data Analysis Using Regression and Multilevel/Hierarchical Models), the authors claim that including redundant mean parameters can help speed up MCMC.
The given example is a non-nested model of "flight simulator" (Eq…

Heisenberg
11 votes
2 answers
What is a good analogy to illustrate the strengths of Hierarchical Bayesian Models?
I'm relatively new to Bayesian statistics and have recently been using JAGS to build hierarchical Bayesian models on different datasets. While I'm very satisfied with the results (compared to standard GLM models), I need to explain to…

nassimhddd