I know that the beta distribution is conjugate to the binomial. But what is the conjugate prior of the beta? Thank you.
-
If one works with one parameter fixed, say Beta$(\theta,1)$, then a conjugate prior for $\theta$ is a Gamma distribution. – StubbornAtom Apr 14 '20 at 18:45
-
Also see the question [Random number generation for conjugate distribution of beta distribution](https://stats.stackexchange.com/questions/485170/random-number-generation-for-conjugate-distribution-of-beta-distribution). – Mankka Nov 05 '20 at 14:00
5 Answers
Yes, it has a conjugate prior in the exponential family. Consider the three-parameter family $$ \pi(\alpha, \beta \mid a, b, p) \propto \left\{\frac{\Gamma(\alpha + \beta)}{\Gamma(\alpha)\Gamma(\beta)}\right\}^p \exp\left(a\alpha + b\beta \right). $$ For some values of $(a, b, p)$ this is integrable, although I haven't quite figured out which. I believe $p \ge 0$ and $a < 0, b < 0$ should work: $p = 0$ corresponds to independent exponential distributions, so that definitely works, and the conjugate update involves incrementing $p$, which suggests $p > 0$ works as well.
The problem, and at least part of the reason no one uses it, is that $$ \int_0^\infty \int_0^\infty \left\{\frac{\Gamma(\alpha + \beta)}{\Gamma(\alpha)\Gamma(\beta)}\right\}^p \exp\left(a\alpha + b\beta \right) \, d\alpha \, d\beta = ? $$ i.e. the normalizing constant doesn't have a closed form.
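To make the update and the missing normalizing constant concrete, here is a minimal Python sketch. The function names (`log_prior_kernel`, `conjugate_update`, `approx_normalizer`) are hypothetical, and the truncation point `upper` and grid size `n` are arbitrary assumptions. The update rule follows from multiplying the kernel above by a Beta$(\alpha, \beta)$ likelihood and dropping factors that do not depend on $(\alpha, \beta)$.

```python
import math

def log_prior_kernel(alpha, beta, a, b, p):
    """Unnormalized log density: p * log[G(a+b) / (G(a) G(b))] + a*alpha + b*beta."""
    log_ratio = math.lgamma(alpha + beta) - math.lgamma(alpha) - math.lgamma(beta)
    return p * log_ratio + a * alpha + b * beta

def conjugate_update(a, b, p, xs):
    """Hyperparameters after observing xs, each assumed drawn from Beta(alpha, beta).

    Each observation x adds log(x) to a, log(1 - x) to b, and 1 to p.
    """
    a_post = a + sum(math.log(x) for x in xs)
    b_post = b + sum(math.log(1.0 - x) for x in xs)
    return a_post, b_post, p + len(xs)

def approx_normalizer(a, b, p, upper=40.0, n=400):
    """Midpoint-rule estimate of the double integral, truncated to (0, upper)^2."""
    h = upper / n
    total = 0.0
    for i in range(n):
        alpha = (i + 0.5) * h
        for j in range(n):
            beta = (j + 0.5) * h
            total += math.exp(log_prior_kernel(alpha, beta, a, b, p))
    return total * h * h
```

With $p = 0$ and $a = b = -1$ the kernel is a product of two unit exponentials, so `approx_normalizer(-1.0, -1.0, 0)` should come out close to 1. For $p > 0$ the same routine only gives a numerical estimate, which is exactly the inconvenience described above.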

-
Ah. That is problematic. I was going to look for an uninformative version of the conjugate prior anyway, so looks like I might as well start with uniform priors over the two parameters. Thanks. – Brash Equilibrium Aug 15 '13 at 20:42
-
I think you might be missing the action of $p$ in your $\exp$ term as well. It should probably be $pa\alpha$, etc. – Neil G Aug 17 '13 at 23:38
-
@NeilG $p$ is in the $\exp$, you just have to express things in terms of $\log \Gamma(\cdots)$ instead of $\Gamma(\cdots)$. Doing $pa\alpha$ is just a reparametrization; it changes nothing. Not sure what you mean by "just comparing likelihoods". You can't implement a Gibbs sampler with this prior without using something like Metropolis, which kills the advantage of conditional conjugacy; the normalizing constant depends on $a$ and $b$, which kills putting a prior on them or estimating them by likelihood methods, etc. – guy Aug 17 '13 at 23:44
-
You're right that putting $p$ there is a reparameterization. However, should that be a triple integral, since you want to integrate over all three parameters… and $a$ and $b$ as you have them should be negative so two of the integrals should run over the negative reals. – Neil G Aug 17 '13 at 23:47
-
@NeilG The integral is over $\alpha$ and $\beta$ since those are the random variables. – guy Aug 18 '13 at 01:59
-
Does this type of prior assume $\alpha$ and $\beta$ are independent? What if they are not? – Jon May 28 '17 at 21:38
It seems that you already gave up on conjugacy. Just for the record, one thing that I've seen people doing (but don't remember exactly where, sorry) is a reparameterization like this. If $X_1,\dots,X_n$ are conditionally iid, given $\alpha,\beta$, such that $X_i\mid\alpha,\beta\sim\mathrm{Beta}(\alpha,\beta)$, remember that $$ \mathbb{E}[X_i\mid\alpha,\beta]=\frac{\alpha}{\alpha+\beta} =: \mu $$ and $$ \mathbb{Var}[X_i\mid\alpha,\beta] = \frac{\alpha\beta}{(\alpha+\beta)^2(\alpha+\beta+1)} =: \sigma^2 \, . $$ Hence, you may reparameterize the likelihood in terms of $\mu$ and $\sigma^2$ and use as a prior $$ \sigma^2\mid\mu \sim \mathrm{U}[0,\mu(1-\mu)] \qquad \qquad \mu\sim\mathrm{U}[0,1] \, . $$ Now you're ready to compute the posterior and explore it by your favorite computational method.
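A rough sketch of how this could be explored numerically, in Python. Everything here is illustrative: the function names are hypothetical, and the grid size `n` is an arbitrary choice. One convenience I am assuming is the change of variable $t = \sigma^2 / \{\mu(1-\mu)\}$, under which the prior above is flat on the unit square in $(\mu, t)$, so the posterior weights on a grid are just the likelihood values.

```python
import math

def to_alpha_beta(mu, var):
    """Map mean and variance of a Beta distribution back to (alpha, beta)."""
    if not (0.0 < mu < 1.0 and 0.0 < var < mu * (1.0 - mu)):
        raise ValueError("need 0 < mu < 1 and 0 < var < mu * (1 - mu)")
    nu = mu * (1.0 - mu) / var - 1.0  # nu equals alpha + beta
    return mu * nu, (1.0 - mu) * nu

def log_beta_pdf(x, alpha, beta):
    """Log density of Beta(alpha, beta) at x in (0, 1)."""
    return (math.lgamma(alpha + beta) - math.lgamma(alpha) - math.lgamma(beta)
            + (alpha - 1.0) * math.log(x) + (beta - 1.0) * math.log(1.0 - x))

def posterior_mean_mu(xs, n=60):
    """Midpoint-grid estimate of E[mu | xs] under the uniform prior on (mu, t)."""
    cells = []
    for i in range(n):
        mu = (i + 0.5) / n
        for j in range(n):
            t = (j + 0.5) / n  # t = var / (mu * (1 - mu)), uniform on (0, 1)
            alpha, beta = to_alpha_beta(mu, t * mu * (1.0 - mu))
            loglik = sum(log_beta_pdf(x, alpha, beta) for x in xs)
            cells.append((mu, loglik))
    top = max(loglik for _, loglik in cells)  # subtract the max to avoid underflow
    weights = [(mu, math.exp(loglik - top)) for mu, loglik in cells]
    return sum(mu * w for mu, w in weights) / sum(w for _, w in weights)
```

For example, `to_alpha_beta(0.5, 1.0 / 12.0)` recovers the uniform case $\alpha = \beta = 1$ up to rounding, and `posterior_mean_mu([0.3, 0.7])` should sit at 0.5 by symmetry. A plain grid is only one option; any other numerical integration or sampling scheme over $(\mu, \sigma^2)$ would serve the same purpose.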
-
No, not MCMC this thing! Quadrature this thing! Only 2 parameters: quadrature is the "gold standard" for small dimensional posteriors, both for time and accuracy. – probabilityislogic Aug 18 '13 at 22:47
-
Another option is to regard $\psi = \alpha + \beta$ as a measure of precision, and again use $\mu = \frac{\alpha}{\alpha + \beta}$ as the mean. This is done all the time with Dirichlet processes, and the beta distribution is a special case. So maybe toss a gamma or log-normal prior on $\psi$ and uniform on $\mu$. – guy Aug 18 '13 at 23:34
-
Hi @Zen, I'm dealing with this problem right now, but I'm new to Bayesian methods and I'm not sure I understand the idea. I figured you were proposing to find $\int_{0}^{1}\frac{1}{\mu(1-\mu)}\,d\mu$ and then use the reparameterization, but of course that was not the idea. Can you please help me understand? – Red Noise Mar 12 '17 at 15:54
-
Another option is to parameterize by "concentration", $c = \sqrt{\alpha^2 + \beta^2}$, which scales independent of the mean. [Velten et al., 2015](http://msb.embopress.org/content/11/6/812) use this with uniform prior on the mean and log-normal prior on $c$. Details are in [their supplement (PDF)](http://msb.embopress.org/content/msb/11/6/812/DC1/embed/inline-supplementary-material-1.pdf?download=true). – merv Jul 06 '18 at 22:30
-
@Zen can you clarify what is the purpose of this reparameterization? How is this more helpful than working in the original parameterization? – Ceph Nov 02 '18 at 19:34
-
@probabilityislogic Does quadrature mean numerical integration, or something different (or perhaps something more specific)? – ashman May 15 '21 at 21:14
In theory there should be a conjugate prior for the beta distribution. This is because
- the beta distribution is one of the exponential family distributions, and
- in theory it should be possible to derive a prior. See, e.g., Wikipedia, D Blei's lecture on exponential families.
However, the derivation looks difficult, and to quote A Bouchard-Cote's Exponential Families and Conjugate Priors:
An important observation to make is that this recipe does not always yields a conjugate prior that is computationally tractable.
Consistent with this, there is no prior for the Beta distribution in D Fink's A Compendium of Conjugate Priors.

-
The derivation is not difficult. See my answer: http://mathoverflow.net/questions/63496/what-can-be-said-about-an-infinite-linear-chain-of-conjugate-prior-distributions/65203#65203 – Neil G Aug 17 '13 at 23:27
Robert and Casella (RC) happen to describe the family of conjugate priors of the beta distribution in Example 3.6 (pp. 71-75) of their book, Introducing Monte Carlo Methods in R, Springer, 2010. However, they quote the result without citing a source.
Added in response to gung's request for details. RC state that for the distribution $B(\alpha, \beta)$, the conjugate prior is "... of the form
$$ \pi(\alpha,\beta) \propto \Big\{ \frac{\Gamma(\alpha+\beta)} {\Gamma(\alpha)\Gamma(\beta)} \Big\} ^{\lambda} x_0^{\alpha} y_0^{\beta} $$
where $\{\lambda, x_0, y_0\}$ are hyperparameters, since the posterior is then equal to
$$ \pi(\alpha,\beta \vert x) \propto \Big\{ \frac{\Gamma(\alpha+\beta)} {\Gamma(\alpha)\Gamma(\beta)} \Big\} ^{\lambda} (xx_0)^{\alpha} ((1-x)y_0)^{\beta}." $$
The remainder of the example concerns importance sampling from $\pi(\alpha,\beta \vert x)$ in order to compute the marginal likelihood of $x$.
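As a sanity check on the quoted update, here is a one-observation sketch. It is my own working rather than anything taken from RC, and it assumes a single observation $x \sim B(\alpha, \beta)$ with the prior kernel quoted above. Multiplying the likelihood by the prior gives
$$ \pi(\alpha,\beta \vert x) \propto \frac{\Gamma(\alpha+\beta)} {\Gamma(\alpha)\Gamma(\beta)} x^{\alpha-1} (1-x)^{\beta-1} \times \Big\{ \frac{\Gamma(\alpha+\beta)} {\Gamma(\alpha)\Gamma(\beta)} \Big\} ^{\lambda} x_0^{\alpha} y_0^{\beta} \propto \Big\{ \frac{\Gamma(\alpha+\beta)} {\Gamma(\alpha)\Gamma(\beta)} \Big\} ^{\lambda+1} (xx_0)^{\alpha} ((1-x)y_0)^{\beta}, $$
where the factor $x^{-1}(1-x)^{-1}$ is dropped because it does not depend on $(\alpha, \beta)$. Under this working the exponent on the gamma ratio goes from $\lambda$ to $\lambda + 1$ with each observation, so the hyperparameters update as $(\lambda, x_0, y_0) \to (\lambda + 1, x x_0, (1-x) y_0)$.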

-
I don't have Robert's book available but the posterior is $\pi(\alpha, \beta) \propto \left( \frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha) \Gamma(\beta)} \right)^{\lambda +1} (x x_0)^{\alpha-1} \left(y_0 (1-x) \right)^{\beta-1}$. Robert also posted on this topic here http://mathoverflow.net/questions/20399/conjugate-prior-of-the-dirichlet-distribution#comment553934_20875 – Fred Schoen Jan 13 '16 at 20:22
-
I humbly recommend that the original poster update the post to indicate that the posterior given in the textbook is incorrect, per Fred Schoen's comment (which is easily verified). – RMurphy May 29 '17 at 21:04
I do not believe there is a "standard" (i.e., exponential family) distribution that is the conjugate prior for the beta distribution. However, if one does exist it would have to be a bivariate distribution.
-
I have no idea about this question, but I did find this handy conjugate prior map that seems to support your answer: http://www.johndcook.com/conjugate_prior_diagram.html – Justin Bozonier Aug 15 '13 at 03:10
-
The conjugate prior is in the exponential family and has three parameters, not two. – Neil G Aug 17 '13 at 23:25
-
@Neil, you are definitely right. I guess I should have said it would have to have at least two parameters. – Aug 18 '13 at 18:28
-
-1: this answer is clearly wrong in the claim that "conjugate prior does not exist in the exponential family", as is demonstrated in the [answer](https://stats.stackexchange.com/a/67511/163572) above... – Jan Kukacka Apr 24 '18 at 08:34