Background
I am interested in appropriate methods for using data to specify priors. A previous question asks how to elicit priors from experts and received some good recommendations. Here, I would like to learn how to specify a prior from data instead. I plan to use these priors in a meta-analysis that synthesizes additional data that I collect.
Update: Although John provides a 'correct' answer, in my case it would require substantial modification of the original model to implement, so I would prefer to find a way to estimate the prior as a separate, discrete step.
Questions
What is the best way to specify such a data-informed prior?
If I am working with parameters for a particular species (monkeys), and this species belongs to a larger group of organisms (primates), and data are available for primates but not for monkeys, would it be appropriate to fit a distribution based on the primate data?
Example cases, first with proposed solution
I have 100 observations of thumb length, one from each of 100 primate species:
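    library(MASS)
    set.seed(0)
    # simulate 100 thumb lengths from a gamma distribution
    thumb <- rgamma(100, 4, 0.1)
    # maximum-likelihood fit of a gamma distribution
    fitdistr(thumb, 'gamma')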
Indeed, when there is no a priori reason to select a particular distribution, the distribution can be chosen by comparing the maximized log-likelihoods of the candidate fits:
    for (dist in c('gamma', 'lognormal', 'weibull')) {
      # print() is needed to show values from inside a for loop
      print(logLik(fitdistr(thumb, dist)))
    }
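The comparison above could also be collected into a single vector so that the best-fitting family and its parameter estimates can be pulled out for use as a prior. This is only a minimal sketch, assuming the three candidate families listed above are the only ones under consideration; the names `candidates`, `fits`, `loglik`, `best`, and `prior_par` are illustrative and not part of the original example. Because all three families have two parameters, ranking by log-likelihood here gives the same ordering as ranking by AIC.

```r
library(MASS)

# Simulated thumb lengths, as in the example above
set.seed(0)
thumb <- rgamma(100, 4, 0.1)

# Candidate families (assumed set; all have two parameters)
candidates <- c("gamma", "lognormal", "weibull")

# Fit each family by maximum likelihood
fits <- lapply(candidates, function(d) fitdistr(thumb, d))
names(fits) <- candidates

# Maximized log-likelihood of each fit
loglik <- sapply(fits, function(f) as.numeric(logLik(f)))
print(loglik)

# Family with the highest log-likelihood and its parameter estimates,
# which could then be plugged in as the prior's parameters
best <- names(which.max(loglik))
prior_par <- fits[[best]]$estimate
print(best)
print(prior_par)
```

Note that this treats the point estimates as fixed and ignores their uncertainty; the standard errors in `fits[[best]]$sd` give a rough sense of how much that might matter.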
For eye diameter, I have collected means, standard errors, and sample sizes from 50 primate species, plus a single independent observation from each of another 50 species:
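    eye <- data.frame(
      diameter = rgamma(100, 4, 0.1),
      # standard errors are known only for the first 50 species
      se = c(rlnorm(50, 0.5, 1), rep(NA, 50)),
      # sample sizes: 1 to 5 for the summarized species, 1 for single observations
      n = c(rep(1:5, 10), rep(1, 50)))
    eye <- signif(eye, 3)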
How can I incorporate the sample statistics into my calculation of a prior?