
In 'Bayesian Data Analysis' (Gelman, Carlin, Stern and Rubin) on page 64 it reads:

"If the density of $y$ is such that $p(y-\theta|\theta)$ is a function that is free of $\theta$ and $y$, say $f(u)$ where $u = y - \theta$, then $y - \theta$ is a pivotal quantity, and $\theta$ is called a pure location parameter. In such a case, it is reasonable that a noninformative prior distribution for $\theta$ would give $f(y - \theta)$ for the posterior ditsribution, $p(y - \theta|y)$. That is, under the posterior distribution, $y-\theta$ should still be a pivotal quantity, whose distribution is free of both $\theta$ and $y$. Under this condition, using Bayes' rule, $p(y - \theta|y) \propto p(\theta)p(y - \theta|\theta)$..."

Maybe I'm being dense, but shouldn't Bayes' rule say something like $p(y - \theta|y) \propto p(y - \theta)p(y|y-\theta)$? What am I forgetting to remember here?

Taylor
  • $p(y - \theta|y) = p(y - \theta,y)/p(y) \propto p(y - \theta,y) \propto p(y , \theta) \propto p(y - \theta,\theta)\propto p(\theta)p(y - \theta|\theta)$ -- unless I did something silly there, I guess it would go something like that. – Glen_b Oct 03 '14 at 03:18
  • I guess I'm confused as to why $p(y-\theta, y) \propto p(y,\theta)$. I see how the Jacobian is one, but I don't see how that's the same thing. I'm running through some examples in my head, like if $y$ and $\theta$ were two independent standard normals: transforming them would correlate them. Off the top of my head I can't see how that's proportional in two variables. – Taylor Oct 03 '14 at 17:17
  • Since the Jacobian is 1 the intuition should be a bit easier to generate. Think about being in a small region near $(y,\theta)$, say of size $dy\, d\theta$. Whether you define where you are in terms of $(y-\theta,y)$ or $(y,\theta)$ or $(y-\theta,\theta)$, the probability you're in that little region will be the same. – Glen_b Oct 04 '14 at 01:39
  • I guess it's just notational confusion. Usually when I think of change of variables, I rename the new stuff. If you don't rename things, and you don't have to worry about a Jacobian, it's exactly the same density when you write it down. I'm not crazy though, right? Writing the same density down as a density for something else is a bit sloppy, I'd say. But thanks, @Glen_b – Taylor Oct 06 '14 at 22:45
  • I'm also having some trouble here. $p(y-\theta|\theta)$ is the density with parameter $\theta$ evaluated at point $y-\theta$, right? Say, a normal density with $\sigma=1$ and mean $\mu$, $f(y-\mu|\mu)$. I do not see how $f(y-\mu|\mu)\propto \exp(-1/2((y-\mu)-\mu)^2)$ is "a function that is free of $\mu$ and $y$". – Christoph Hanck Jan 08 '16 at 14:44
  • No, I don't think so. $p(y-\theta|\theta)$ is the density of the new random variable $y-\theta$ (let's call it $Z$). Then by the transformation theorem $p(y-\theta|\theta) = p_y(z+\theta|\theta)|1|$. Plug that in and you can see that the density is free of any location parameter $\mu$ or $\theta$ (the normal case is written out right after these comments). – Taylor Jan 09 '16 at 16:11
  • $p_\theta(\theta|y) \propto p_Y(y|\theta) p(\theta) = p_Z(z|\theta)p(\theta)$ where the 'proportional symbol' is from Bayes' rule, and the equality is because we're assuming $\theta$ is a pure location parameter for $y$. Then we're done after we see that $p_{\theta}(\theta|y) = p_{y-\theta}(y-\theta|y)$ because $y$ is a pure location parameter for $\theta$, which they say is reasonable to assume. – Taylor Jan 09 '16 at 16:31
  • 1
    I see, thanks - their notation is pretty "compact", I would say :-). – Christoph Hanck Jan 14 '16 at 13:49
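For concreteness, here is the normal case from the last few comments worked out (an illustration assuming $y \mid \theta \sim N(\theta, 1)$, not part of the book's argument). With $Z = y - \theta$, the transformation theorem gives
$$ p_Z(z \mid \theta) = p_Y(z + \theta \mid \theta)\,|1| = \frac{1}{\sqrt{2\pi}}\exp\!\left(-\tfrac{1}{2}\big((z+\theta)-\theta\big)^2\right) = \frac{1}{\sqrt{2\pi}}\,e^{-z^2/2}, $$
which is free of both $\theta$ and $y$, exactly as the quoted passage requires.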

1 Answer


They're both true. To justify the version the book states, it might help to use slightly more explicit notation.

If $\theta$ is a location parameter, then set $U = Y-\theta$ and observe that $$ p_{U \mid Y}(u \mid y) \propto p_{U,Y}(u,y) = p_{U,\theta}(u, \theta)|1| = p_{U \mid\theta}(u \mid \theta) p_{\theta}(\theta). $$ We can use the $\propto$ sign because the normalizing constant $p_Y(y)$ is free of both $u$ and $\theta$. The first equality is by the transformation theorem, where we transform $(u,\theta) \mapsto (u,y)$ and the Jacobian has absolute value $1$. The last equality is just the definition of a conditional density.
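As a quick numerical sanity check (a sketch, assuming the normal location model $y \mid \theta \sim N(\theta, 1)$ and approximating the flat prior on a grid; the function and variable names below are only illustrative), the grid posterior of $u = y - \theta$ should coincide with $f(u) = N(0,1)$ no matter which $y$ is observed:

```python
# Sketch: for y | theta ~ N(theta, 1) with a flat prior, the posterior of
# u = y - theta given y should match f(u) = N(0, 1) for any observed y.
import numpy as np
from scipy import stats

def posterior_of_u(y, theta_grid):
    """Grid approximation to p(u | y) for u = y - theta under a flat prior."""
    # Flat prior: the posterior over theta is proportional to the likelihood.
    post_theta = stats.norm.pdf(y, loc=theta_grid, scale=1.0)
    step = theta_grid[1] - theta_grid[0]
    post_theta /= post_theta.sum() * step        # normalize on the grid
    # u = y - theta has |Jacobian| = 1, so p(u | y) = p_theta(y - u | y).
    u = y - theta_grid
    order = np.argsort(u)
    return u[order], post_theta[order]

for y in (-3.0, 0.0, 7.5):
    theta_grid = np.linspace(y - 10.0, y + 10.0, 4001)
    u, p_u = posterior_of_u(y, theta_grid)
    # f(u) is the standard normal density, free of both y and theta.
    max_err = np.max(np.abs(p_u - stats.norm.pdf(u)))
    print(f"y = {y:5.1f}: max |p(u|y) - f(u)| = {max_err:.1e}")
```

The reported discrepancy is at numerical precision for every $y$, which is exactly the pivotal-quantity statement in the quote: under a flat prior the posterior of $y - \theta$ is free of both $y$ and $\theta$.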

Stumbling on the same paragraph more than $4$ years later...

Taylor