
If I use a Jeffreys prior for a binomial probability parameter $\theta$, this amounts to using a $\theta \sim \text{beta}(1/2,1/2)$ distribution.

If I reparameterise to $\phi = \theta^2$ then clearly $\phi$ does not also follow a $\text{beta}(1/2,1/2)$ distribution.

My question is: in what sense is the Jeffreys prior invariant to reparameterisation? I think I am misunderstanding the topic, to be honest...

kjetil b halvorsen
ben18785
    Jeffreys' prior is invariant in the sense that starting with a Jeffreys prior for one parameterisation and running the appropriate change of variable is identical to deriving the Jeffreys prior directly for this new parameterisation. Actually, _equivariant_ would be more appropriate a term than _invariant_. – Xi'an Apr 24 '17 at 20:47
  • @ben18785: take a look at https://stats.stackexchange.com/questions/38962/why-is-the-jeffreys-prior-useful – Zen Apr 24 '17 at 21:48
  • See also https://math.stackexchange.com/questions/210607/in-what-sense-is-the-jeffreys-prior-invariant (more or less the same question I think, but on a different site). – N. Virgo Apr 24 '17 at 23:16
  • See also https://stats.stackexchange.com/questions/139001/example-for-a-prior-that-unlike-jeffreys-leads-to-a-posterior-that-is-not-inva – Christoph Hanck Mar 06 '19 at 07:49

1 Answer


Let $\phi = g(\theta)$, where $g$ is a monotone function of $\theta$, and let $h$ be the inverse of $g$, so that $\theta = h(\phi)$. We can obtain the Jeffreys prior distribution $p_{J}(\phi)$ in two ways:

  1. Start with the binomial model (1) \begin{equation} \label{original} p(y | \theta) = \binom{n}{y} \theta^{y} (1-\theta)^{n-y} \end{equation} reparameterise it with $\phi = g(\theta)$ to get $$ p(y | \phi) = \binom{n}{y} h(\phi)^{y} (1-h(\phi))^{n-y} $$ and derive the Jeffreys prior distribution $p_{J}(\phi)$ directly for this model.
  2. Obtain the Jeffreys prior distribution $p_{J}(\theta)$ from the original binomial model (1) and apply the change-of-variables formula to obtain the induced prior density on $\phi$: $$ p_{J}(\phi) = p_{J}(h(\phi)) \left|\frac{dh}{d\phi}\right|. $$

Being invariant to reparameterisation means that the densities $p_{J}(\phi)$ derived in both ways are the same. The Jeffreys prior has this property. [Reference: *A First Course in Bayesian Statistical Methods* by P. Hoff.]

To answer your comment: to obtain the Jeffreys prior distribution $p_{J}(\theta)$ from the likelihood of the binomial model $$ p(y | \theta) = \binom{n}{y} \theta^{y} (1-\theta)^{n-y} $$ we calculate the Fisher information by taking the logarithm $l$ of the likelihood and computing its second derivative: \begin{align*} l := \log(p(y | \theta)) &\propto y \log(\theta) + (n-y) \log(1-\theta) \\ \frac{\partial l }{\partial \theta} &= \frac{y}{\theta} - \frac{n-y}{1-\theta} \\ \frac{\partial^{2} l }{\partial \theta^{2}} &= -\frac{y}{\theta^{2}} - \frac{n-y}{ (1-\theta)^{2} } \end{align*} The Fisher information is then (using $E(y \mid \theta) = n\theta$) \begin{align*} I(\theta) &= -E\left(\frac{\partial^{2} l }{\partial \theta^{2}} \,\Big|\, \theta\right) \\ &= \frac{n\theta}{\theta^{2}} + \frac{n - n \theta}{(1-\theta)^{2}} \\ &= \frac{n}{\theta ( 1- \theta)} \\ &\propto \theta^{-1} (1-\theta)^{-1}. \end{align*} The Jeffreys prior for this model is \begin{align*} p_{J}(\theta) &= \sqrt{I(\theta)} \\ &\propto \theta^{-1/2} (1-\theta)^{-1/2}, \end{align*} which is $\texttt{beta}(1/2, 1/2)$.
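The hand calculation above can be reproduced symbolically. This is a sketch of my own (SymPy); substituting $E(y \mid \theta) = n\theta$ directly is valid here because the second derivative is linear in $y$:

```python
# Symbolic reproduction (a sketch, not part of the original answer) of the
# Fisher-information calculation for the binomial model.
import sympy as sp

theta, n, y = sp.symbols("theta n y", positive=True)

# Log-likelihood up to the theta-free binomial coefficient.
l = y * sp.log(theta) + (n - y) * sp.log(1 - theta)
d2l = sp.diff(l, theta, 2)

# Take the expectation by substituting E[y] = n*theta
# (valid because d2l is linear in y).
I = sp.simplify(-d2l.subs(y, n * theta))
print(I)  # equivalent to n/(theta*(1 - theta)), up to rearrangement
```

Since $\sqrt{I(\theta)} \propto \theta^{-1/2}(1-\theta)^{-1/2}$, this confirms the $\texttt{beta}(1/2, 1/2)$ form.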

Ben
Marko Lalovic
    Thanks for your answer. Afraid I am being a bit slow though. In what sense can we obtain a prior from a likelihood? They are two separate things, and the latter does not imply the former... – ben18785 Apr 24 '17 at 20:25
    I answered above by obtaining the Jeffreys prior $p_{J}(\theta)$ from the likelihood for the binomial model. – Marko Lalovic Apr 24 '17 at 20:36