Allow data to dictate the priors and then run the model using these priors? (e.g., data-driven priors from same data set)

Question

It is my understanding that we should not allow the same data set we are analyzing to drive or define the prior distributions in a Bayesian analysis. Specifically, it is inappropriate to define prior distributions based on summary statistics from the same data set to which the model will then be fitted using those priors.

Does anyone happen to know of resources that specifically discuss this as being inappropriate? I need some citations for this issue.

related: [what-methods-can-be-used-to-specify-priors-from-data](http://stats.stackexchange.com/questions/5542/what-methods-can-be-used-to-specify-priors-from-data) — David LeBauer, Sep 22 '11 at 05:11

David LeBauer · Answer 1 · 2011-09-22T14:40:23.993

Yes, this is inappropriate because it uses the same data twice, which leads to falsely overconfident results. This is known as 'double dipping'.
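To make the overconfidence concrete, here is a minimal simulation sketch. It rests on assumptions that are mine and not from any of the references: a normal model with known standard deviation, and a "double-dipped" prior set to N(sample mean, sigma^2/n), i.e. built from the same data that then enters the likelihood. The function name `coverage` and its parameters are hypothetical.

```python
import numpy as np

def coverage(n=20, sigma=1.0, true_mu=0.0, reps=20000, seed=0):
    """Estimate how often nominal 95% credible intervals contain true_mu."""
    rng = np.random.default_rng(seed)
    z = 1.959963984540054  # 97.5% quantile of the standard normal
    x = rng.normal(true_mu, sigma, size=(reps, n))
    xbar = x.mean(axis=1)
    se = sigma / np.sqrt(n)

    # Flat prior on the mean: the posterior is N(xbar, se^2).
    flat_hit = np.abs(xbar - true_mu) <= z * se

    # Assumed double-dipped prior N(xbar, se^2), estimated from the same data.
    # The conjugate update with that data again gives N(xbar, se^2 / 2),
    # so the interval shrinks by a factor of sqrt(2) without new information.
    dd_sd = se / np.sqrt(2.0)
    dd_hit = np.abs(xbar - true_mu) <= z * dd_sd

    return flat_hit.mean(), dd_hit.mean()

flat_cov, double_dip_cov = coverage()
print(f"flat prior coverage:        {flat_cov:.3f}")
print(f"double-dipped coverage:     {double_dip_cov:.3f}")
```

Under these assumptions the flat-prior interval should cover the true mean about 95% of the time, while the double-dipped interval should cover it only about 83% of the time, even though both are reported as 95% intervals.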

For references, I would start with Carlin and Louis (2000). Although 'double dipping' has been one of the primary critiques of Empirical Bayes, Ch. 3, in particular section 3.5, of this book describes ways to estimate appropriate confidence intervals using the EB approach.

Berger, J. (2006). "The Case for Objective Bayesian Analysis." Bayesian Analysis, 1(3), 385-402.

Carlin, B. P., and Louis, T. A. (2000). Bayes and Empirical Bayes Methods for Data Analysis.

Darniede, W. F. (2011). Bayesian Methods for Data-Dependent Priors. MS Thesis, Ohio State Univ.

Gelman, A., Carlin, J. B., Stern, H. S., and Rubin, D. B. (2003). Bayesian Data Analysis, Second Edition (Chapman & Hall/CRC Texts in Statistical Science). Chapman and Hall/CRC.


score 1 · Answer 2 · answered Sep 27 '11 at 19:26


It can make sense to use the data to build the prior, though.

For an example in mixture modelling, see Richardson & Green (1997): http://citeseer.ist.psu.edu/viewdoc/summary?doi=10.1.1.27.3667

They use the mean and the range of the data points as hyperparameters for the prior and it makes perfect sense.

In my opinion, the problem of using the data twice occurs when an informative prior is derived from the data.

As long as you check that your prior distribution is "flat" where the posterior distribution is peaked, you know that your prior distribution does not have a strong impact on the results.
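One way such a check could look is sketched below. This is only an illustration under assumptions of my own: a normal model with known standard deviation, a weak data-dependent prior centred on the midpoint of the data with standard deviation equal to the data range (loosely in the spirit of the hyperparameter choice mentioned above), and a made-up diagnostic, `prior_flatness`, that reports the ratio of the largest to the smallest prior density over the central 95% posterior interval. A ratio near 1 suggests the prior is nearly flat where the posterior is peaked.

```python
import numpy as np

def normal_pdf(x, mean, sd):
    """Density of N(mean, sd^2) evaluated at x."""
    return np.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * np.sqrt(2 * np.pi))

def prior_flatness(data, sigma, prior_mean, prior_sd):
    """Max/min prior density over the central 95% posterior interval.

    Hypothetical diagnostic: assumes a normal likelihood with known sigma
    and a conjugate N(prior_mean, prior_sd^2) prior on the mean.
    """
    n = len(data)
    precision = 1.0 / prior_sd**2 + n / sigma**2
    post_mean = (prior_mean / prior_sd**2 + data.sum() / sigma**2) / precision
    post_sd = precision ** -0.5
    grid = np.linspace(post_mean - 1.96 * post_sd, post_mean + 1.96 * post_sd, 201)
    dens = normal_pdf(grid, prior_mean, prior_sd)
    return dens.max() / dens.min()

rng = np.random.default_rng(1)
sigma = 1.0
data = rng.normal(3.0, sigma, size=50)

# Weak data-dependent prior: midpoint and range of the observations.
midpoint = 0.5 * (data.min() + data.max())
data_range = data.max() - data.min()
weak_ratio = prior_flatness(data, sigma, midpoint, data_range)

# Informative data-derived prior: centred on the sample mean, as tight as
# the standard error. This is the double-use case criticised above.
informative_ratio = prior_flatness(data, sigma, data.mean(), sigma / np.sqrt(len(data)))

print(f"weak prior ratio:        {weak_ratio:.3f}")
print(f"informative prior ratio: {informative_ratio:.3f}")
```

Under these assumptions the weak prior should give a ratio close to 1, while the informative prior derived from the same data should give a ratio well above 2, which flags it as having a real influence on the posterior.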


Pierre


Using the data to build the prior cannot take place within the Bayesian paradigm. So it does not make sense from a Bayesian perspective, and the usual validation of Bayesian procedures does not apply. The resulting inference may be perfectly valid, but one has to demonstrate it from first principles. (Richardson and Green use what is called empirical Bayes, which is not a Bayesian procedure.) – Xi'an Jan 09 '12 at 11:20
While it does not make sense within the Bayesian paradigm, sometimes the dividing line between what is data and what is prior is difficult to draw. See my answer to http://stats.stackexchange.com/questions/112451/maximum-likelihood-estimation-mle-in-layman-terms – kjetil b halvorsen Sep 07 '15 at 13:01
