I have the following question about "Data Snooping" in Bayesian Models.
In Bayesian models, priors are generally said to come from a source independent of the data, such as domain experts familiar with the subject matter. Suppose you want to fit the following model to your data:
Y = b_0 + b_1*X1 + epsilon, where epsilon ~ N(0, sigma)
You decide to place a normal prior on b_0 ~ N(mu_1, sigma_1), a normal prior on b_1 ~ N(mu_2, sigma_2), and a log-normal prior on sigma ~ LogNormal(mu_3, sigma_3).
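To make the setup concrete, here is a minimal sketch of the negative log posterior for this model in Python (the function name and the use of scipy.stats are my own illustration, not part of the setup itself):

```python
import numpy as np
from scipy import stats

def neg_log_posterior(params, x, y, mu, sd):
    """Negative log posterior for Y = b_0 + b_1*X1 + epsilon.

    mu = (mu_1, mu_2, mu_3) and sd = (sigma_1, sigma_2, sigma_3)
    are the six prior hyperparameters discussed above.
    """
    b0, b1, sigma = params
    if sigma <= 0:                 # sigma must be positive
        return np.inf
    # Likelihood: y_i ~ N(b0 + b1 * x_i, sigma)
    log_lik = stats.norm.logpdf(y, loc=b0 + b1 * x, scale=sigma).sum()
    # Priors: b_0 ~ N(mu_1, sigma_1), b_1 ~ N(mu_2, sigma_2),
    #         sigma ~ LogNormal(mu_3, sigma_3)
    log_prior = (stats.norm.logpdf(b0, loc=mu[0], scale=sd[0])
                 + stats.norm.logpdf(b1, loc=mu[1], scale=sd[1])
                 + stats.lognorm.logpdf(sigma, s=sd[2], scale=np.exp(mu[2])))
    return -(log_lik + log_prior)
```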
Suppose you choose some values for mu_1, sigma_1, mu_2, sigma_2, mu_3, sigma_3:
- (Model 1) Choice 1: mu_1 = a1, sigma_1 = a2, mu_2 = a3, sigma_2 = a4, mu_3 = a5, sigma_3 = a6
- (Model 2) Choice 2: mu_1 = b1, sigma_1 = b2, mu_2 = b3, sigma_2 = b4, mu_3 = b5, sigma_3 = b6
Based on a randomly selected 70% sample of the data, you then find the Bayesian point estimates (e.g. the MAP estimates) of b_0, b_1, and sigma for Choice 1 and Choice 2.
You now want to evaluate the performance of both models on the remaining 30% of the data. Suppose you notice that the MSE of Model 1 is lower than the MSE of Model 2, indicating that Model 1 is better.
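Continuing the sketch above (it reuses neg_log_posterior and its imports), the split-fit-evaluate procedure might look like the following; the synthetic data, the 70/30 split, and the two hyperparameter settings standing in for a1..a6 and b1..b6 are all made up for illustration:

```python
from scipy.optimize import minimize

rng = np.random.default_rng(0)
n = 200
x = rng.normal(size=n)
y = 1.0 + 2.0 * x + rng.normal(scale=0.5, size=n)  # synthetic data

idx = rng.permutation(n)
train, test = idx[:int(0.7 * n)], idx[int(0.7 * n):]  # 70% / 30% split

def map_fit(mu, sd):
    # MAP estimate = minimizer of the negative log posterior on the training set
    res = minimize(neg_log_posterior, x0=np.array([0.0, 0.0, 1.0]),
                   args=(x[train], y[train], mu, sd), method="Nelder-Mead")
    return res.x

# Hypothetical stand-ins for Choice 1 (a1..a6) and Choice 2 (b1..b6)
choices = {"Model 1": ([0.0, 0.0, 0.0], [1.0, 1.0, 1.0]),
           "Model 2": ([0.0, 0.0, 0.0], [10.0, 10.0, 10.0])}

for label, (mu, sd) in choices.items():
    b0, b1, sigma = map_fit(mu, sd)
    mse = np.mean((y[test] - (b0 + b1 * x[test])) ** 2)
    print(f"{label}: holdout MSE = {mse:.4f}")
```

Comparing the two printed MSE values is exactly the selection step I am asking about.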
My Question: Since in this case you have effectively treated the Bayesian priors as hyperparameters and selected them based on model performance, is this equivalent to "data snooping"? Should Bayesian priors always be selected "prior" to fitting the model, with the choice of priors having no relation to the actual model performance?
Thanks!