
How should one go about turning a frequentist result into a Bayesian prior?

Consider the following fairly generic scenario: an experiment was conducted in the past and a value for some parameter $\phi$ was measured. The analysis was done with a frequentist methodology, and a confidence interval for $\phi$ is given in the results.

I'm now conducting a new experiment in which I want to measure some other parameters, say both $\theta$ and $\phi$. My experiment is different from the previous study; it does not use the same methodology. I would like to do a Bayesian analysis, so I need to place priors on $\theta$ and $\phi$.

No previous measurements of $\theta$ have been performed, so I place an uninformative (say, uniform) prior on it.

As mentioned, there is a previous result for $\phi$, given as a confidence interval. To use it in my current analysis, I need to translate that frequentist result into an informative prior.

One option that is unavailable in this made-up scenario is to repeat, in a Bayesian fashion, the previous analysis that led to the $\phi$ measurement. If I could do that, $\phi$ would have a posterior from the previous experiment that I would then use as my prior, and there would be no issue.

How should I translate the frequentist CI into a Bayesian prior distribution for my analysis? In other words, how can I translate their frequentist result on $\phi$ into a posterior on $\phi$ that I would then use as a prior in my analysis?

Any insights or references that discuss this type of issue are welcome.

  • Prior, or posterior distribution? – Tim Jun 06 '16 at 19:09
  • edited for clarity, better? – bill_e Jun 06 '16 at 19:39
  • Can you have a uniform from $-\infty$ to $+\infty$? – mdewey Jun 06 '16 at 19:42
  • 1
    Not sure what this has to do with meta-analysis. Can you clarify – mdewey Jun 06 '16 at 19:42
  • 3
    You're looking for matching priors, Welch and Peers style. Take a look at this review: http://projecteuclid.org/euclid.lnms/1215091929 – Zen Jun 06 '16 at 19:44
  • There is a lot of meta-information to take into account here. Unless you are measuring some constant of nature, I've never seen the results of an experiment translate perfectly to a different situation, even when the two should be identical on paper. In the past, I have used previously reported data to set a weakly or at best mildly informative prior. Independently of what method you choose, I would recommend making the prior heavy-tailed to account for deviations from your assumptions (see the sketch after these comments). – lacerbi Jun 06 '16 at 19:48
  • @lacerbi, the situation i was thinking of was actually about physical constant. I figured I would leave the question more general since this scenario likely applies in other domains. – bill_e Jun 06 '16 at 20:02
  • @mdewey edited out the $-\infty$ to $\infty$ bit. I added the meta-analysis tag since there is an aspect of combining different analyses. I can remove if it is not appropriate. – bill_e Jun 06 '16 at 20:06
  • What is the confidence level for your confidence interval? Moreover, is $\phi$ a location or scale parameter, or something else that can be used? – peuhp Jun 13 '16 at 10:06
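
Picking up lacerbi's heavy-tail suggestion above: a minimal sketch in Python (scipy assumed; the numbers are illustrative placeholders, not from the question). It keeps a CI-derived center and scale, as in the answers below, but swaps the Gaussian for a low-degrees-of-freedom Student $t$:

```python
from scipy import stats

# CI-derived center and scale for phi (placeholder values, not from the question)
phi_hat, sigma = 1.2, 0.2

# Gaussian prior vs. a heavy-tailed alternative with the same center and scale
gaussian_prior = stats.norm(loc=phi_hat, scale=sigma)
heavy_prior = stats.t(df=3, loc=phi_hat, scale=sigma)  # low df = fat tails

# The t prior keeps far more mass away from phi_hat, guarding against the
# previous experiment not translating perfectly to the new one
for name, dist in [("normal", gaussian_prior), ("t(3)", heavy_prior)]:
    print(name, 1 - dist.cdf(phi_hat + 4 * sigma))  # P(phi > center + 4 sigma)
```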

2 Answers


Short version: Take a Gaussian centered at the previous estimate, with standard deviation equal to the half-width of the CI (this assumes a 68% interval; see the comment below for other confidence levels).

Long version: Let $\phi_0$ be the true value of the parameter, and let $\hat\phi$ be the estimate that you have. Assume an a priori uniform prior $P(\phi)=\text{const}$. You want the distribution of $\phi_0$ given that the estimate $\hat\phi$ has been obtained:

$$ P(\phi_0|\hat\phi)=\frac{P(\hat\phi|\phi_0)P(\phi_0)}{P(\hat \phi)}\\ =P(\hat\phi|\phi_0)\, \frac{\text{const}}{P(\hat \phi)} $$ Now the only dependence on $\phi_0$ is in the term $P(\hat\phi|\phi_0)$; the rest is a normalization constant. Assuming that $\hat\phi$ is a maximum likelihood estimate (or some other consistent estimator), we can use the following facts:

  1. As the number of observations increases, the MLE is asymptotically Gaussian,
  2. It is asymptotically unbiased (centered at the true value $\phi_0$),
  3. It fluctuates around $\phi_0$ with variance equal to the inverse Fisher information of the previous observations; the reported CI half-width is an estimate of the corresponding standard deviation, so its square gives the variance.

Another way to put it: The Bayesian posterior and the distribution of a consistent and efficient estimator become asymptotically the same.
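
As a concrete illustration, here is a minimal sketch of this recipe in Python (scipy assumed). The interval endpoints and confidence level are placeholders for the published values; the code recovers the point estimate and converts the half-width into a standard deviation using the matching normal quantile:

```python
from scipy import stats

# Published frequentist result (placeholder values): a 95% CI for phi
lo, hi = 0.8, 1.6    # reported confidence limits
conf = 0.95          # confidence level of the interval

# Center the Gaussian prior at the interval midpoint (the point estimate)
phi_hat = 0.5 * (lo + hi)

# A symmetric (1 - alpha) CI spans z_{1-alpha/2} standard errors on each
# side (z = 1 for 68%, 1.96 for 95%), so the half-width divided by z
# estimates the standard deviation of the estimator
z = stats.norm.ppf(0.5 + conf / 2)
sigma = (hi - lo) / (2 * z)

prior_phi = stats.norm(loc=phi_hat, scale=sigma)
print(phi_hat, sigma)            # prior mean and standard deviation
print(prior_phi.interval(conf))  # reproduces (lo, hi) as a sanity check
```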

  • I should add that this solution assumes a 68% CI, which is one sigma. If your confidence interval is 95%, you are at two sigmas, so you should divide its half-width by 2; if it is 99.7%, that is three sigmas, so divide by 3. https://en.wikipedia.org/wiki/68%E2%80%9395%E2%80%9399.7_rule – Alex Monras Jun 20 '16 at 10:47
  • I was about to comment precisely what is in your comment :-) Maybe you should add that to your reply. I would... – Rolazaro Azeveires Jun 25 '17 at 21:06

It depends. In some simple cases with normally distributed data, when you have a frequentist confidence interval based on the $t$-distribution, the corresponding marginal posterior from a Bayesian analysis is a shifted, rescaled Student $t$-distribution with quantiles matching the frequentist confidence limits; see https://en.wikipedia.org/wiki/Student%27s_t-distribution#Bayesian_inference. Similarly, if you have a frequentist confidence interval for some variance parameter $\sigma^2$ derived via the chi-square distribution of a pivotal quantity such as $S^2(n-p)/\sigma^2$, the corresponding Bayesian marginal posterior is an inverse rescaled chi-square (an inverse gamma distribution), again with quantiles matching the frequentist confidence limits (provided that a non-informative scale prior $\propto 1/\sigma^2$ is used).
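
A hedged sketch of the first case in Python (scipy assumed; the summary statistics are invented placeholders): given a 95% CI of the usual form $\bar x \pm t_{0.975,\,\nu}\, s/\sqrt n$, the matching marginal posterior is a $t_\nu$ distribution shifted to $\bar x$ and scaled by $s/\sqrt n$, and its 2.5% and 97.5% quantiles reproduce the confidence limits:

```python
from scipy import stats

# Placeholder summary statistics from the earlier normal-data experiment
xbar, s, n = 1.2, 0.5, 20     # sample mean, sample sd, sample size
nu = n - 1                    # degrees of freedom
se = s / n ** 0.5             # standard error of the mean

# Frequentist 95% CI based on the t-distribution
tcrit = stats.t.ppf(0.975, nu)
ci = (xbar - tcrit * se, xbar + tcrit * se)

# Matching Bayesian marginal posterior: a shifted, rescaled Student t
posterior = stats.t(df=nu, loc=xbar, scale=se)

print(ci)                             # frequentist confidence limits
print(posterior.ppf([0.025, 0.975]))  # identical posterior quantiles
```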
