How is the accuracy of the results of Gibbs sampler measured?
Most resources merely say to iterate it $k$ times. But how does one infer the accuracy of the result?
How is the accuracy of the results of Gibbs sampler measured?
Most resources merely say to iterate it $k$ times. But how does one infer the accuracy of the result?
If you assume your draws have already converged to the stationary distribution, you need to estimate how variable this distribution is. Say you're using estimating $E[h(X)]$ with $\overline{h}_n(X) = n^{-1}\sum_{i=1}^nh(X_i)$ then you need to estimate its asymptotic variance: $$ \text{Var}(\overline{h}_n(X)) \approx \frac{\sigma^2}{n}\left[ 1 + 2 \sum_{i=1}^n\text{Corr}(X_1, X_{i+1})\right] $$ where $\sigma^2 = \text{Var}[h(X)]$ (derivation here). Notice that the uncertainty increases when your draws are more correlated. That means you definitely don't want to use the standard formula for sample variance that assumes your data is iid.
The two ways that I know of to estimate the above are spectral methods and different types of batch means methods. The spectral method comes from the fact that evaluating a spectral density at $0$ gives you almost the same expression above (i.e. $ 2 \pi f(0) = \sigma^2(1 + 2\sum_{i=1}^n\text{Corr}(X_1, X_{i+1}))$). Batch means, on the other hand, splits your samples up into batches/windows, takes the mean of each batch, and computes a typical sample variance for those batch means. If the batches are big enough, then those batch means have really weak dependence. The R
package mcmcse
handles both of these situations.
The Gibbs sampler accuracy is provided in the simulation-base bayesian inference background.
Before looking at the sampling accuracy, in such context, you need for:
The convergence statistical diagnostics help you to do that.
Shown below the main convergence criteria described in the literature:
Generally speaking, although all of these statistics aim to test for convergence, the third test reports whether the chain sample size is adequate enough to meet the required accuracy; the failure of the test suggests that you need for longer Markov Chain to achieve the convergence and an accurate estimate.
You can get a longer Markov Chain by moving the number of iterations (this is the reason why you found those resources suggesting you to increase the number of iterations) or the burn-in records.
Once the convergence is achieved, you have an accurate sampling, parameter estimates or whatever.
Woefully, it is not possible to know exactly how much your sampling is accurate; you can simply know only if the Markov chain coming out the sampling process converges or not.
I suggest you to read the following reference:
KOLASSA, John E., et al. Convergence and accuracy of Gibbs sampling for conditional distributions in generalized linear models. The Annals of Statistics, 1999, 27.1: 129-142.
Hope this helps.