I know that a beta distribution with unknown parameters a,b has a 95% HPD of [0.25, 0.75]. What is the correct approach to solve for a,b?
-
First argue that $a=b,$ then find the (unique) solution numerically. – whuber Jul 17 '21 at 18:54
-
I think you have to solve the integral equation $\int_{0.25}^{0.75}\mathrm{Beta}(x;a,b)\,dx = 0.95$ – Fiodor1234 Jul 17 '21 at 22:09
-
@Fiodor1234 That gets you part of the way there. However, there is a one-dimensional manifold of solutions and only one of them has $(1/4,3/4)$ for its HPD interval. – whuber Jul 18 '21 at 17:53
-
@whuber By $(1/4,3/4)$ do you mean the parameters $a,b$? Also, how are you supposed to solve the integral equation I mentioned? I've read that there is no closed-form solution. – Fiodor1234 Jul 18 '21 at 21:09
-
@Fiodor Many people would want to compute the solution directly by finding the zero numerically, as I wrote in my first comment. – whuber Jul 19 '21 at 12:32
1 Answer
Let's characterize a highest posterior density (HPD) region. My analysis of the general question (of finding an HPD region for a density $f$) at https://stats.stackexchange.com/a/383626/919 shows that the density takes equal values at the two endpoints: $f(1/4)=f(3/4).$ Since the Beta density with parameters $\alpha,\beta$ (the question's $a,b$) at a number $0\lt x \lt 1$ is
$$f(x;\alpha,\beta) = \frac{1}{B(\alpha,\beta)} x^{\alpha-1}\,(1-x)^{\beta-1},$$
by plugging in $x=1/4$ and $x=3/4$ separately and equating we find
$$\frac{1}{B(\alpha,\beta)}\left(\frac{1}{4}\right)^{\alpha-1}\left(\frac{3}{4}\right)^{\beta-1} = f\left(\frac{1}{4};\alpha,\beta\right)=f\left(\frac{3}{4};\alpha,\beta\right)=\frac{1}{B(\beta,\alpha)}\left(\frac{3}{4}\right)^{\alpha-1}\left(\frac{1}{4}\right)^{\beta-1}.$$
Since $B(\alpha,\beta)=B(\beta,\alpha),$ we may clear the denominators and derive
$$3^{\beta-1} = 3^{\alpha-1},$$
whence $\alpha=\beta.$
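For readers who want the cancellation spelled out, here is one way to write the intermediate step. It only uses the two displayed equations above; multiplying through by $4^{\alpha+\beta-2}$ is my choice of route, not necessarily the one the argument had in mind.

$$\left(\frac{1}{4}\right)^{\alpha-1}\left(\frac{3}{4}\right)^{\beta-1} = \left(\frac{3}{4}\right)^{\alpha-1}\left(\frac{1}{4}\right)^{\beta-1} \iff \frac{3^{\beta-1}}{4^{\alpha+\beta-2}} = \frac{3^{\alpha-1}}{4^{\alpha+\beta-2}} \iff 3^{\beta-1} = 3^{\alpha-1}.$$

Because $t \mapsto 3^t$ is strictly increasing, equal powers force equal exponents, so $\beta-1=\alpha-1.$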
With this result, let $F(x,\alpha)$ be the cumulative distribution function (CDF) of a Beta$(\alpha,\alpha)$ distribution at the value $0\lt x \lt 1.$ Because its density is symmetric about $1/2$ (specifically, $f(x;\alpha,\alpha) = f(1-x;\alpha,\alpha)$ for all $x$), $F$ satisfies
$$F(1-x,\alpha) = 1 - F(x,\alpha)$$
for all $x.$
The other fact postulated in the question is that the total probability of the interval $[1/4,3/4]$ is $95\%.$ In terms of $F$ this means
$$\frac{95}{100} = 95\% = F\left(\frac{3}{4},\alpha\right) - F\left(\frac{1}{4},\alpha\right) = 1 - 2F\left(\frac{1}{4},\alpha\right).$$
In other words,
$\alpha$ must be a zero of the function $$h(\alpha) = F\left(\frac{1}{4},\alpha\right) - \left(1-\frac{95}{100}\right)/\,2.$$
This requires a numerical solution. Use your favorite root finder.
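As one possible illustration, here is a minimal Python sketch that finds the zero of $h$ with SciPy's `brentq`. The choice of SciPy, the names `LEVEL`, `P`, and `alpha_hat`, and the search bracket $[1, 50]$ are my assumptions, not part of the argument above; any bracketing root finder would do.

```python
from scipy.optimize import brentq
from scipy.stats import beta

LEVEL = 0.95  # probability mass of the HPD interval
P = 0.25      # lower endpoint; the upper endpoint is 1 - P


def h(alpha):
    """F(1/4, alpha) minus the single-tail probability (1 - 0.95)/2."""
    return beta.cdf(P, alpha, alpha) - (1 - LEVEL) / 2


# Assumed bracket: h(1) = 0.25 - 0.025 > 0 (the uniform case), while for
# alpha = 50 almost all mass sits near 1/2, so h(50) is close to -0.025 < 0.
alpha_hat = brentq(h, 1.0, 50.0)
print(alpha_hat)

# Sanity check: the mass on [1/4, 3/4] should come back as 0.95.
print(beta.cdf(1 - P, alpha_hat, alpha_hat) - beta.cdf(P, alpha_hat, alpha_hat))
```

A bracketing method is convenient here because the sign change of $h$ on the bracket guarantees convergence without a good starting value.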
It's not hard to show that this root is a strictly increasing function of $p$ (here $p=1/4$), where the endpoints of the HPD interval are placed symmetrically at $p$ and $1-p$ (with $0 \lt p \lt 1/2$). It's nice to know this, because it justifies the following comments, but proving it would take us a little far afield in this post.
Finding zeros can be tricky. It helps when you can find a reasonable initial estimate to get the search started. In this case, since for $\alpha \gg 1$ the Beta distribution is approximately Normal, we can match moments and replace $F$ by the CDF of a Normal distribution of mean $1/2$ and variance $1/(4(2\alpha+1)).$ Such a CDF is readily expressed in terms of the CDF of the standard Normal distribution, $\Phi.$ The resulting equation is
$$0 \approx \Phi\left(\frac{1/4 - 1/2}{\sqrt{1/(4(2\alpha+1))}}\right) - \left(1-\frac{95}{100}\right)/2$$ whose solution can be expressed in terms of the Normal quantile function $\Phi^{-1}$ as
$$Z = \Phi^{-1}\left(\left(1 - \frac{95}{100}\right)/\,2\right) \approx -1.95996;$$
$$\alpha = \left(\left(\frac{Z}{2(1/4 - 1/2)}\right)^2 - 1\right)/\,2 \approx7.1829.$$
Because these Beta distributions have negative excess kurtosis, this will be an upper estimate. It is readily polished with any root-finding algorithm. In a few steps it will yield the value $\alpha\approx 6.9159.$
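The two-stage procedure (Normal starting value, then polishing) might be sketched as follows. This assumes SciPy; the variable names are mine, and using `scipy.optimize.newton` without a derivative (which makes it a secant iteration) is just one convenient choice of polishing step.

```python
from scipy.optimize import newton
from scipy.stats import beta, norm

level, p = 0.95, 0.25

# Starting value from the Normal approximation, following the formulas above.
z = norm.ppf((1 - level) / 2)
alpha_start = ((z / (2 * (p - 0.5))) ** 2 - 1) / 2


def h(alpha):
    """Exact Beta(alpha, alpha) tail mass below p, minus the target tail."""
    return beta.cdf(p, alpha, alpha) - (1 - level) / 2


# No derivative is supplied, so SciPy falls back to the secant method.
alpha_hat = newton(h, alpha_start)
print(alpha_start, alpha_hat)
```

The polished root should land slightly below the starting value, in line with the remark that the Normal approximation gives an upper estimate.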
In the accompanying figure, the gray area (beneath the Beta density) is exactly $95\%.$ The approximate Normal density (having the same mean and variance) is shown in dotted red.
Finally, I have described this solution in such a way that it can readily be adapted to variants of the problem where the interval $[1/4,3/4]$ is replaced by any symmetric interval $[p,1-p]$ and $95\%$ can be any value between $0$ and $100\%.$
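A hedged sketch of that generalization is below. The function name `symmetric_hpd_alpha` is hypothetical, and two design choices are my own assumptions. First, the search is restricted to $\alpha \gt 1$ so that the density stays unimodal and $[p,1-p]$ really is an HPD interval, which requires the tail mass $(1-\text{level})/2$ to be smaller than $p.$ Second, the upper end of the bracket is found by repeated doubling.

```python
from scipy.optimize import brentq
from scipy.stats import beta


def symmetric_hpd_alpha(p, level):
    """Return alpha > 1 such that Beta(alpha, alpha) has mass `level` on [p, 1 - p]."""
    if not (0 < p < 0.5 and 0 < level < 1):
        raise ValueError("need 0 < p < 1/2 and 0 < level < 1")
    tail = (1 - level) / 2
    if tail >= p:
        # Assumption of this sketch: at alpha = 1 (uniform) the tail mass is p,
        # so a larger tail would need alpha <= 1, where the density is not unimodal.
        raise ValueError("level too small for a unimodal Beta(alpha, alpha)")

    def g(alpha):
        return beta.cdf(p, alpha, alpha) - tail

    # g(1) = p - tail > 0; double the upper end until g changes sign.
    upper = 2.0
    while g(upper) > 0:
        upper *= 2
    return brentq(g, 1.0, upper)


print(symmetric_hpd_alpha(0.25, 0.95))
# Narrower intervals (larger p) should need larger alpha.
print([symmetric_hpd_alpha(p, 0.95) for p in (0.20, 0.25, 0.30)])
```

The second printout is a quick empirical look at the monotonicity claim made earlier: the roots should increase with $p.$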
