
I want to use MH to get samples from $p(\theta \mid y) \propto p(y \mid \theta)\, p(\theta)$. Let's assume $\theta$ is heavily constrained and I transform $\theta$ to $f(\theta)$ so that I can sample in an unconstrained space.

The new posterior becomes $p(f(\theta) \mid y) \propto p(y \mid f(\theta))\, p(f(\theta)) \,\times\, \left\lvert \det J_{f^{-1}}(f(\theta)) \right\rvert$. Note that I only changed the prior term (pushforward measure) and left the likelihood term unchanged, as it is a probability distribution over $y$, not over $\theta$.
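For instance (a toy scalar case of my own, not essential to the question): if $\theta > 0$ and $f(\theta) = \log\theta$, then $f^{-1}(\xi) = e^{\xi}$ and $\lvert\det J_{f^{-1}}(\xi)\rvert = e^{\xi}$, so in log form the transformed prior is the original log-prior evaluated at $e^{\xi}$ plus $\xi$. A minimal sketch, assuming an illustrative Gamma prior:

```python
import numpy as np
from scipy import stats

def log_prior_theta(theta):
    # prior density on the constrained scale; Gamma(2, 1) is only an illustration
    return stats.gamma.logpdf(theta, a=2.0, scale=1.0)

def log_prior_xi(xi):
    # pushforward prior on xi = log(theta):
    # log p(xi) = log p_theta(exp(xi)) + log|det J_{f^{-1}}(xi)|, and the log-Jacobian is xi
    return log_prior_theta(np.exp(xi)) + xi
```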

(1) My question now is: can I, in the Metropolis-Hastings acceptance ratio, just evaluate

$$\frac{p(y \mid \theta^\star) }{ p(y \mid \theta) } \,\times\, \frac{p(f(\theta^\star)) \left\lvert \det J_{f^{-1}}(f(\theta^\star)) \right\rvert }{ p(f(\theta)) \left\lvert \det J_{f^{-1}}(f(\theta)) \right\rvert }$$

? This expression makes me nervous, because I transform $\theta$, evaluate the pdf of the transformed prior, but then transform back and evaluate the likelihood at the parameter in the original space. However, I cannot evaluate the first factor of the following alternative:

$$\frac{p(y \mid f(\theta^\star)) }{ p(y \mid f(\theta)) } \,\times\, \frac{p(f(\theta^\star)) \left\lvert \det J_{f^{-1}}(f(\theta^\star)) \right\rvert }{ p(f(\theta)) \left\lvert \det J_{f^{-1}}(f(\theta)) \right\rvert }.$$
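For concreteness, here is what the ratio in (1) looks like in the toy case above (my own illustration, assuming a scalar $\theta > 0$, $f(\theta) = \log\theta$ and a symmetric proposal on $\xi = f(\theta)$): the pushforward prior is $p(e^{\xi})\, e^{\xi}$, so the ratio in (1) reads
$$\frac{p(y \mid e^{\xi^\star})}{p(y \mid e^{\xi})} \,\times\, \frac{p(e^{\xi^\star})\, e^{\xi^\star}}{p(e^{\xi})\, e^{\xi}},$$
with $\theta^\star = e^{\xi^\star}$ and $\theta = e^{\xi}$; every factor here can be computed.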

I could somehow reverse-engineer the problem, i.e. define priors on $f(\theta)$ directly and then map $f(\theta)$ back to $\theta$. The Jacobian of the inverse transform would then take the role of the Jacobian of the transform in my original problem, so all terms could be evaluated. However, I originally wanted my priors to carry meaning for $\theta$, not for some unconstrained $f(\theta)$.

EDIT: Problem solved and clarified, thank you; I should have seen this myself! Please also see the Stack Exchange thread linked to this post for further clarification.


1 Answer


You should notice that what you denote $p(y|f(\theta))$ is actually the same as $p(y|\theta)$ [if you overlook the terrible abuse of notations]. As you mention, changing the parameterisation does not modify the density of the random variable at the observed value $y$ and there is no Jacobian associated with that part.

With proper notations, if
\begin{align*}
\theta &\sim \pi(\theta)\qquad\qquad&\text{prior}\\
y|\theta &\sim f(y|\theta)\qquad\qquad&\text{sampling}\\
\xi &= h(\theta) \qquad\qquad&\text{reparameterisation}\\
\dfrac{\text{d}\theta}{\text{d}\xi}(\xi) &= J(\xi)\qquad\qquad&\text{Jacobian}\\
y|\xi &\sim g(y|\xi)\qquad\qquad&\text{reparameterised density}\\
\xi^{(t+1)}|\xi^{(t)} &\sim q(\xi^{(t+1)}|\xi^{(t)}) \qquad\qquad&\text{proposal}
\end{align*}
the Metropolis-Hastings ratio associated with the proposal $\xi'\sim q(\xi'|\xi)$ in the $\xi$ parameterisation is
$$ \underbrace{\dfrac{\pi(\theta(\xi'))J(\xi')}{\pi(\theta(\xi))J(\xi)}}_\text{ratio of priors}\times \underbrace{\dfrac{f(y|\theta(\xi'))}{f(y|\theta(\xi))}}_\text{likelihood ratio}\times \underbrace{\dfrac{q(\xi|\xi')}{q(\xi'|\xi)}}_\text{proposal ratio} $$
which can also be written as
$$\dfrac{\pi(h^{-1}(\xi'))J(\xi')}{\pi(h^{-1}(\xi))J(\xi)}\times \dfrac{g(y|\xi')}{g(y|\xi)}\times \dfrac{q(\xi|\xi')}{q(\xi'|\xi)}. $$
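As a minimal sketch of this ratio in code (my own toy setup, not from the answer: scalar $\theta > 0$ with $\xi = h(\theta) = \log\theta$, so $\theta(\xi) = e^{\xi}$ and $J(\xi) = e^{\xi}$, an illustrative Gamma prior, an illustrative Exponential likelihood, and a Gaussian random walk so the proposal ratio cancels):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

def log_prior(theta):
    # pi(theta): prior on the constrained scale (Gamma(2, 1) is only an illustration)
    return stats.gamma.logpdf(theta, a=2.0, scale=1.0)

def log_lik(theta, y):
    # f(y | theta): illustrative Exponential(rate = theta) likelihood
    return stats.expon.logpdf(y, scale=1.0 / theta).sum()

def mh_unconstrained(y, n_iter=5000, step=0.5):
    xi = 0.0                                  # current state on the unconstrained scale
    draws = np.empty(n_iter)
    for t in range(n_iter):
        xi_prop = xi + step * rng.normal()    # symmetric random walk: q(xi|xi')/q(xi'|xi) = 1
        # log MH ratio = [log prior + log Jacobian] difference + log-likelihood difference,
        # with theta(xi) = exp(xi) and log J(xi) = xi
        log_ratio = (log_prior(np.exp(xi_prop)) + xi_prop
                     - log_prior(np.exp(xi)) - xi
                     + log_lik(np.exp(xi_prop), y)
                     - log_lik(np.exp(xi), y))
        if np.log(rng.uniform()) < log_ratio:
            xi = xi_prop
        draws[t] = np.exp(xi)                 # store theta = exp(xi) on the original scale
    return draws
```

Running, say, `draws = mh_unconstrained(y=np.array([0.8, 1.2, 0.5]))` returns samples of $\theta$ on the original constrained scale; the Jacobian enters only through the prior factor, exactly as in the ratio above.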

Xi'an