
In Bayesian models, can you use Uniform(-inf, inf) as a prior?

I ask because in a class, we looked at the Metropolis-Hastings MCMC sampler and showed that, to sample from a distribution, we need not explicitly solve for the denominator: the numerator is proportional to the posterior, which informs the sampler where it should spend more or less of its time, so you really only need to be concerned with the prior and likelihood terms.
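The point about never needing the denominator can be sketched with a toy Metropolis-Hastings run (my own minimal example, not from the class): the acceptance ratio only ever uses the *ratio* of unnormalised densities, so the normalising constant cancels.

```python
# Minimal Metropolis-Hastings sketch (illustrative; all names are my own).
# Target: a standard normal known only up to a constant -- we deliberately
# never compute the normalising denominator.
import math
import random

def unnormalised_target(x):
    # proportional to N(0, 1); the 1/sqrt(2*pi) factor is omitted on purpose
    return math.exp(-0.5 * x * x)

def metropolis_hastings(n_samples, step=1.0, seed=0):
    rng = random.Random(seed)
    x = 0.0
    samples = []
    for _ in range(n_samples):
        proposal = x + rng.gauss(0.0, step)
        # The acceptance probability depends only on the ratio of
        # unnormalised densities, so any constant factor cancels.
        if rng.random() < min(1.0, unnormalised_target(proposal) / unnormalised_target(x)):
            x = proposal
        samples.append(x)
    return samples

samples = metropolis_hastings(20000)
mean = sum(samples) / len(samples)
print(round(mean, 2))  # close to 0 for a standard-normal target
```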

I asked the question, "what if the prior term in the numerator was just multiplying the likelihood by 1?" To which my professor said, "this would be analogous to specifying a Uniform prior with support from negative infinity to positive infinity, as there are no range limitations and every value that the parameter could take on would be weighted the same."

First, I'm not sure whether this is or isn't okay. And second, I've heard that 'there really aren't any uninformative priors' though this sounds about as uninformative as a prior could get.

Could someone clarify?

jbuddy_13
  • Note that in general, there is no uniform distribution that can cover the whole real line or the positive real line. – Peter O. Dec 19 '21 at 22:24
  • Does this answer your question? [What is an "uninformative prior"? Can we ever have one with truly no information?](https://stats.stackexchange.com/questions/20520/what-is-an-uninformative-prior-can-we-ever-have-one-with-truly-no-information) – Hong Ooi Dec 20 '21 at 01:57

2 Answers



On this forum, there are many related questions and answers about flat priors. They are not uniform priors, because they are not probability distributions but $\sigma$-finite measures (with infinite mass), and they are not the most uninformative or non-informative priors, for many reasons detailed in those answers (and in Bayesian textbooks). If the posterior attached to the likelihood $f(x|\theta)$ and a flat (constant) prior $\pi(\theta)=c$ is well-defined, i.e., can be normalised into a probability density for almost all realisations of the random variable $X$ behind the observed data, $$\int_\Theta f(x|\theta)~\text d\theta < \infty\qquad\forall x\quad\text{a.s.}$$ then using this extension of the standard Bayesian framework is acceptable.
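The normalisation condition above can be checked numerically in a simple case (my own sketch, with a hypothetical observation `x_obs`): for a Normal$(\theta, 1)$ likelihood and a flat prior, the integral of the likelihood over $\theta$ is finite (it equals 1), so the flat-prior posterior is proper.

```python
# Numerical check (my own illustration, not from the answer): with a flat
# prior and a Normal(theta, 1) likelihood for one observation x, the
# integral of f(x|theta) over theta is finite, so the posterior is proper.
import math

def likelihood(x, theta):
    # Normal(theta, 1) density evaluated at x
    return math.exp(-0.5 * (x - theta) ** 2) / math.sqrt(2 * math.pi)

def integrate(f, lo, hi, n=200000):
    # simple midpoint rule over a wide truncation of the real line
    h = (hi - lo) / n
    return sum(f(lo + (i + 0.5) * h) for i in range(n)) * h

x_obs = 1.3  # arbitrary hypothetical data point
mass = integrate(lambda t: likelihood(x_obs, t), -50.0, 50.0)
print(round(mass, 4))  # finite (here 1.0), so the posterior can be normalised
```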

Note: the question is unrelated to MCMC (although one should not use MCMC with an improper posterior). The proper entry keyword is improper priors, which are covered in a section or chapter of every Bayesian textbook. Improper priors are $\sigma$-finite measures $\pi(\cdot)$ (with infinite mass) that can be used as prior measures provided $$\int_\Theta f(x|\theta)\,\pi(\text d\theta) < \infty\qquad\forall x\quad\text{a.s.}$$ A flat prior (over an unbounded space) is a particular case of an improper prior, but not a very special one, since a flat prior does not stay constant under most reparameterisations (changes of variables).
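The reparameterisation point can be seen numerically (my own example, with an arbitrary transform $\varphi = \theta^2$): a prior flat in $\theta$ induces a density on $\varphi$ proportional to $1/(2\sqrt{\varphi})$, which piles up near 0 rather than staying flat.

```python
# Sketch of the reparameterisation point (my own example, not from the
# answer): theta flat on (0, 10) is NOT flat on phi = theta**2.
import random

rng = random.Random(0)
theta = [rng.uniform(0.0, 10.0) for _ in range(100000)]
phi = [t * t for t in theta]

# phi ranges over (0, 100); if it were flat, each width-25 bin would hold
# 25% of the mass.  Instead the low bin holds ~50% and the high bin ~13%.
low_bin = sum(1 for p in phi if p < 25.0) / len(phi)    # P(theta < 5) = 0.5
high_bin = sum(1 for p in phi if p >= 75.0) / len(phi)  # P(theta > 8.66) ~ 0.134
print(round(low_bin, 2), round(high_bin, 2))
```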

Xi'an
  • So if I understand, it's valid to use the Uniform dist when the range is finite (as it will integrate to 1), but when the range spans -inf to inf, the mass will be infinite, so that's not valid. Right? – jbuddy_13 Dec 19 '21 at 00:11
  • Might be worth linking to https://stats.stackexchange.com/questions/20520/what-is-an-uninformative-prior-can-we-ever-have-one-with-truly-no-information which you illustrate at the top – Henry Dec 19 '21 at 13:08
  • Also worth noting that "improper" in improper prior just means that it does not integrate to 1, not that it is improper to use in any analysis context. – bdeonovic Dec 19 '21 at 18:23

If parameter estimation is all you're interested in, then yes, you can use (an improper prior such as) a uniform prior, and Jeffreys (1961, Theory of Probability, Clarendon Press) frequently does so in that context. However, any model with an improper prior over a parameter runs into a big problem when it comes to model comparison: computing the marginal likelihood involves dividing through by the infinitely large partition function of the improper prior. As a result, the marginal likelihood of, and therefore the posterior probability of, the model with the improper prior will be smaller by an infinite factor than those of any model with a proper prior.
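A numeric sketch of this effect (my own construction, not from the answer): with data $x \sim N(\theta, 1)$ and a proper Uniform$(-L, L)$ prior, the marginal likelihood carries a $1/(2L)$ factor, so it shrinks toward 0 as the prior flattens out, which is exactly what penalises the limiting improper prior in model comparison.

```python
# Illustration (my own construction): the marginal likelihood under a
# Uniform(-L, L) prior for N(theta, 1) data decays like 1/(2L) as L grows.
import math

def marginal_likelihood(x, L, n=100000):
    # m(x) = (1/(2L)) * integral_{-L}^{L} N(x | theta, 1) dtheta,
    # computed with a midpoint rule
    h = 2 * L / n
    total = 0.0
    for i in range(n):
        theta = -L + (i + 0.5) * h
        total += math.exp(-0.5 * (x - theta) ** 2) / math.sqrt(2 * math.pi)
    return total * h / (2 * L)

x_obs = 0.0  # arbitrary hypothetical observation
for L in (1.0, 10.0, 100.0):
    print(L, marginal_likelihood(x_obs, L))
# once L is large, m(x) is roughly 1/(2L): widening the prior by 10x
# cuts the marginal likelihood by ~10x, and L -> infinity drives it to 0
```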

Daniel Hatton
  • [Jeffreys was also using improper priors for model comparison](https://projecteuclid.org/journals/statistical-science/volume-24/issue-2/Harold-Jeffreyss-Theory-of-Probability-Revisited/10.1214/09-STS284.full), albeit on the nuisance parameters. – Xi'an Dec 19 '21 at 21:53