
Given a proportion and its standard error, what distributional assumption minimizes assumptions/maximizes entropy? Is it the beta (and can I use the method of moments to estimate its parameters)? Or something else?

generic_user

1 Answer


It's a truncated Normal distribution. This is a consequence of Boltzmann's Theorem.
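
As a sketch of why this holds (the multipliers $\lambda_0,\lambda_1,\lambda_2$ are notation introduced here for illustration): maximizing entropy over densities on $[0,1]$ subject to fixed first and second moments gives a density whose logarithm is quadratic,

$$f(x) = \exp\left(\lambda_0 + \lambda_1 x + \lambda_2 x^2\right),\quad 0\le x\le 1.$$

When $\lambda_2 \lt 0$, completing the square with $\sigma^2 = -1/(2\lambda_2)$ and $\mu = \lambda_1\sigma^2$ puts this in the form

$$f(x) \propto \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right),\quad 0\le x\le 1,$$

which is a Normal$(\mu,\sigma)$ density truncated to $[0,1]$.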


The following analysis provides the details needed to implement a practical solution.

A Normal$(\mu,\sigma)$ distribution $F$ truncated to the interval $[0,1]$ arises by taking a standard Normal variable $X$ with probability distribution $\Phi$, scaling it by $\sigma$, shifting it by $\mu$, and truncating the result to $[0,1]$. Equivalently, working backwards, the original variable $X$ must have been truncated to the interval $[-\mu/\sigma, (1-\mu)/\sigma]$, where it had a total probability of

$$C = \Phi\left(\frac{1-\mu}{\sigma}\right) - \Phi\left(\frac{-\mu}{\sigma}\right),\tag{1}$$

expectation

$$\mu_1=\frac{1}{C\sqrt{2\pi}}\int_\frac{-\mu}{\sigma}^\frac{1-\mu}{\sigma} x\exp\left(\frac{-x^2}{2}\right)\mathrm{d}x,$$

and second (raw) moment

$$\mu_2 = \frac{1}{C\sqrt{2\pi}}\int_\frac{-\mu}{\sigma}^\frac{1-\mu}{\sigma} x^2\exp\left(\frac{-x^2}{2}\right)\mathrm{d}x.$$

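Presumably your "standard error" is either $\sqrt{\mu_2-\mu_1^2}$ or some constant multiple of it. (On the original $[0,1]$ scale the truncated variable is $\mu+\sigma X$, so its mean is $\mu+\sigma\mu_1$ and its standard deviation is $\sigma\sqrt{\mu_2-\mu_1^2}$.)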

These integrals can be computed in terms of

$$\mu_1(z) = \frac{1}{C\sqrt{2\pi}}\int_{-\infty}^z x\exp\left(\frac{-x^2}{2}\right)\mathrm{d}x = -\frac{1}{C\sqrt{2\pi}}\exp\left(-\frac{z^2}{2}\right)\tag{2}$$

and, integrating by parts,

$$\eqalign{ \mu_2(z) &= \frac{1}{C\sqrt{2\pi}}\int_{-\infty}^z (x)\left(x\exp\left(\frac{-x^2}{2}\right)\right)\mathrm{d}x \\ &= \frac{1}{C\sqrt{2\pi}}\left(x\left(-\exp\left(-\frac{x^2}{2}\right)\right)\mid_{-\infty}^z - \int_{-\infty}^z -\exp\left(-\frac{x^2}{2}\right)\mathrm{d}x \right)\\ &=-\frac{1}{C\sqrt{2\pi}}z\exp\left(-\frac{z^2}{2}\right) + \frac{1}{C}\Phi(z)\tag{3}. }$$
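
One way to gain confidence in $(2)$ and $(3)$ is to compare them with numerical quadrature. The sketch below does that with the common factor $1/C$ dropped from both sides. The function names, the test point `z = 0.7`, and the cutoff `LOWER = -12` standing in for $-\infty$ are all choices made here for illustration.

```python
import math

SQRT2PI = math.sqrt(2.0 * math.pi)

def phi_cdf(z):
    """Standard Normal CDF, Phi(z), via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def density(x):
    """Standard Normal density."""
    return math.exp(-0.5 * x * x) / SQRT2PI

def first_closed(z):
    """Right-hand side of (2) without the 1/C factor."""
    return -math.exp(-0.5 * z * z) / SQRT2PI

def second_closed(z):
    """Right-hand side of (3) without the 1/C factor."""
    return -z * math.exp(-0.5 * z * z) / SQRT2PI + phi_cdf(z)

def simpson(f, a, b, n=20000):
    """Composite Simpson's rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

LOWER = -12.0  # assumption: the integrands are negligible below this point
z = 0.7        # arbitrary test point

num1 = simpson(lambda x: x * density(x), LOWER, z)
num2 = simpson(lambda x: x * x * density(x), LOWER, z)

# Both differences should be at the level of quadrature/rounding error.
print(abs(num1 - first_closed(z)), abs(num2 - second_closed(z)))
```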

Thus

$$\mu_1 = \mu_1\left(\frac{1-\mu}{\sigma}\right) - \mu_1\left(\frac{-\mu}{\sigma}\right)$$

and

$$\mu_2 = \mu_2\left(\frac{1-\mu}{\sigma}\right) - \mu_2\left(\frac{-\mu}{\sigma}\right).$$
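
The formulas above translate almost line for line into code. The following is a minimal sketch using only Python's standard library, with $\Phi$ computed from `math.erf`. The names `truncated_moments` and `mean_and_sd` are hypothetical helpers, and the conversion to the $[0,1]$ scale assumes the truncated variable is $\mu + \sigma X$.

```python
import math

SQRT2PI = math.sqrt(2.0 * math.pi)

def phi_cdf(z):
    """Standard Normal CDF, Phi(z), via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def truncated_moments(mu, sigma):
    """Return (C, mu1, mu2) for a standard Normal variable truncated
    to [-mu/sigma, (1 - mu)/sigma], following (1), (2), and (3)."""
    lo, hi = -mu / sigma, (1.0 - mu) / sigma
    C = phi_cdf(hi) - phi_cdf(lo)                      # (1)

    def m1(z):                                         # (2)
        return -math.exp(-0.5 * z * z) / (C * SQRT2PI)

    def m2(z):                                         # (3)
        return -z * math.exp(-0.5 * z * z) / (C * SQRT2PI) + phi_cdf(z) / C

    return C, m1(hi) - m1(lo), m2(hi) - m2(lo)

def mean_and_sd(mu, sigma):
    """Mean and standard deviation of Normal(mu, sigma) truncated to [0, 1],
    obtained by rescaling the standardized moments (assumption stated above)."""
    _, mu1, mu2 = truncated_moments(mu, sigma)
    return mu + sigma * mu1, sigma * math.sqrt(mu2 - mu1 * mu1)

print(mean_and_sd(0.3, 0.1))
```

When the truncation points are far out in one tail, $C$ becomes tiny and these differences lose precision, so a production implementation would likely need more careful tail arithmetic.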

These calculations $(1)$, $(2)$, and $(3)$ can be implemented in any software where exponentials, square roots, and $\Phi$ are available. This permits application in any fitting procedure, such as method of moments or maximum likelihood. Either would require numerical solutions.
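
As an illustration of the method-of-moments route, here is a rough sketch that solves for $(\mu,\sigma)$ given a target mean and standard deviation on $[0,1]$. Everything about the solver is an assumption made for this example: Newton's method with a finite-difference Jacobian, the parameterization $t=\log\sigma$ to keep $\sigma$ positive, the starting values, and the hypothetical function names. It is not guaranteed to converge, and no truncated Normal solution exists for some targets (for instance, when the standard deviation is too large relative to the interval).

```python
import math

SQRT2PI = math.sqrt(2.0 * math.pi)

def phi_cdf(z):
    """Standard Normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def pdf(z):
    """Standard Normal density."""
    return math.exp(-0.5 * z * z) / SQRT2PI

def mean_and_sd(mu, sigma):
    """Mean and SD of Normal(mu, sigma) truncated to [0, 1]."""
    lo, hi = -mu / sigma, (1.0 - mu) / sigma
    C = phi_cdf(hi) - phi_cdf(lo)                       # (1)
    mu1 = -(pdf(hi) - pdf(lo)) / C                      # difference of (2)
    mu2 = (-(hi * pdf(hi) - lo * pdf(lo)) + C) / C      # difference of (3)
    return mu + sigma * mu1, sigma * math.sqrt(mu2 - mu1 * mu1)

def fit_by_moments(target_mean, target_sd, tol=1e-12, max_iter=100, h=1e-6):
    """Solve mean_and_sd(mu, sigma) = (target_mean, target_sd) by Newton's
    method in (mu, t) with t = log(sigma). A sketch, not a robust solver."""
    mu, t = target_mean, math.log(target_sd)  # start at the untruncated values

    def resid(mu, t):
        m, s = mean_and_sd(mu, math.exp(t))
        return m - target_mean, s - target_sd

    for _ in range(max_iter):
        r1, r2 = resid(mu, t)
        if max(abs(r1), abs(r2)) < tol:
            break
        # Forward-difference Jacobian of the residuals.
        a1, a2 = resid(mu + h, t)
        b1, b2 = resid(mu, t + h)
        j11, j21 = (a1 - r1) / h, (a2 - r2) / h
        j12, j22 = (b1 - r1) / h, (b2 - r2) / h
        det = j11 * j22 - j12 * j21
        # Newton step: solve J * step = residual by Cramer's rule.
        mu -= (r1 * j22 - r2 * j12) / det
        t -= (j11 * r2 - j21 * r1) / det
    return mu, math.exp(t)

mu_hat, sigma_hat = fit_by_moments(0.3, 0.1)
print(mu_hat, sigma_hat, mean_and_sd(mu_hat, sigma_hat))
```

The final `print` shows the fitted parameters followed by the moments they reproduce, which should match the targets closely if the iteration converged. It is worth checking that match before using the fit.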

whuber