Both in his book and on his blog, Larry Wasserman has discussed an example in which naive application of Bayesian methods gives nonsensical answers.
Intro
The problem is to estimate the normalizing constant $c$ of an un-normalized probability distribution $g(x)$. The target value of $c$ is given by:
$$ c = \int g(x) \, dx $$
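For concreteness (my own illustration, not one from Wasserman's post): if $g(x) = e^{-x^2/2}$, then $c = \int e^{-x^2/2}\,dx = \sqrt{2\pi}$, i.e. the normalizing constant of the standard Gaussian. We can evaluate $g$ pointwise, but the integral is assumed to be intractable in general.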
Prof. Wasserman shows that naive Bayesian estimation of $c$ gives stupid results; I'll let you check his blog for the details. He offers as an open question the construction of a Bayesian estimator of $c$.
This was discussed before on SE, and the answer was mostly that this is a silly example and that there is no reason to be doing Bayesian inference on this problem. But let me try anyway:
The sampling solution
To build a Bayesian estimator, let's first check what conventional methods are available. One such method to estimate $c$ is importance sampling: we generate samples from a second, simple probability distribution $q(x)$. Denoting those samples $x_k$, we then compute the empirical mean of the ratios $\frac{g(x_k)}{q(x_k)}$ and obtain an unbiased estimator of $c$:
$$ \hat{c} = \frac{1}{n} \sum_{k=1}^n \frac{g(x_k)}{q(x_k)} $$
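As a minimal numerical sketch of that estimator, here is what it looks like in Python, with an un-normalized Gaussian for $g$ and a standard Cauchy proposal for $q$ (both are my illustrative choices, not part of the original problem):

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative choices: an un-normalized Gaussian g(x) = exp(-x^2 / 2),
# whose true normalizer is sqrt(2*pi), and a standard Cauchy proposal q(x).
def g(x):
    return np.exp(-0.5 * x**2)

def q_pdf(x):
    return 1.0 / (np.pi * (1.0 + x**2))

n = 10_000
x = rng.standard_cauchy(n)        # samples x_k ~ q
w = g(x) / q_pdf(x)               # ratios g(x_k) / q(x_k)
c_hat = w.mean()                  # unbiased estimate of c

print(c_hat, np.sqrt(2 * np.pi))  # estimate vs. true value ~2.5066
```

With this particular pair, the Cauchy proposal has heavier tails than $g$, so the ratios are bounded and the estimator has finite variance; a thinner-tailed proposal would still run but could give a much noisier estimate.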
Bayesian importance sampling?
Now, it might be stupid, but why couldn't we, at least in theory, construct a Bayesian estimator that takes the sequence $ \frac{g(x_k)}{q(x_k)} $ as data and builds a posterior for $c$ given those observations?
For example, if we happen to be in a case where we know that the ratio $ \frac{g(x)}{q(x)} $ is bounded, we could model the observations as coming from a (scaled) beta distribution and do conjugate inference. If we do not have an upper bound, we might model the observations as Gamma-distributed instead. We might even use likelihoods with non-conjugate priors and/or more complicated priors.
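To make the conjugate route concrete, here is a sketch under one specific (and debatable) modelling assumption of mine: treat the ratios as exponentially distributed, i.e. a Gamma likelihood with shape 1 and unknown rate $\lambda = 1/c$, so that a Gamma prior on $\lambda$ is conjugate and the posterior over $c$ comes out in closed form:

```python
import numpy as np

rng = np.random.default_rng(1)

# Same setup as above: ratios w_k = g(x_k) / q(x_k) from importance sampling.
def g(x):
    return np.exp(-0.5 * x**2)

def q_pdf(x):
    return 1.0 / (np.pi * (1.0 + x**2))

n = 10_000
x = rng.standard_cauchy(n)
w = g(x) / q_pdf(x)

# Illustrative likelihood (my assumption): w_k ~ Exponential(rate lam),
# with lam = 1/c so that E[w_k] = c.  A Gamma(a0, b0) prior on lam
# (rate parameterization) is conjugate: the posterior is
# Gamma(a0 + n, b0 + sum(w)).
a0, b0 = 1.0, 1.0                    # weakly informative prior (assumption)
a_post = a0 + n
b_post = b0 + w.sum()

# c = 1 / lam, so the posterior for c is inverse-gamma(a_post, b_post),
# whose mean is b_post / (a_post - 1).
c_post_mean = b_post / (a_post - 1)

# Posterior draws of c, e.g. for a credible interval.
lam_draws = rng.gamma(shape=a_post, scale=1.0 / b_post, size=5_000)
c_draws = 1.0 / lam_draws

print(c_post_mean, np.quantile(c_draws, [0.025, 0.975]))
```

Whether an exponential (or any other) likelihood is a faithful model for the ratios is of course part of what the question is asking; the sketch only shows that the conjugate mechanics go through once such a model is chosen.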
My questions are:

1. Is this idea of doing Bayesian importance sampling something that has already been analyzed?
2. Does it, for some reason or another, fail to solve this problem of inferring the normalizing constant?