Notice what they say:

> To simplify the algebra we will work with precisions instead of variances. ... in summary, by starting with a normal-gamma prior, we obtain a normal-gamma posterior; i.e., we have found a conjugate prior for the mean and precision of the Gaussian.
They parametrize the Gaussian by its mean $\mu$ and precision $\tau$, where the variance is the inverse of the precision, $\sigma^2 = 1/\tau$. Gathering more data leads to higher precision. Higher precision means lower variance, so everything behaves as expected.
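To see "more data, more precision" concretely, here is a minimal Python sketch of the standard normal-gamma posterior update (the parameter names `mu0`, `kappa0`, `alpha0`, `beta0` and the example data are my own; the update formulas are the usual conjugate ones):

```python
import numpy as np

def normal_gamma_update(x, mu0=0.0, kappa0=1.0, alpha0=1.0, beta0=1.0):
    """Posterior parameters of a normal-gamma prior after observing data x.

    Prior: mu | tau ~ N(mu0, 1/(kappa0 * tau)), tau ~ Gamma(alpha0, rate=beta0).
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    xbar = x.mean()
    ss = ((x - xbar) ** 2).sum()  # sum of squared deviations from the sample mean

    kappa_n = kappa0 + n
    mu_n = (kappa0 * mu0 + n * xbar) / kappa_n
    alpha_n = alpha0 + n / 2.0
    beta_n = beta0 + 0.5 * ss + kappa0 * n * (xbar - mu0) ** 2 / (2.0 * kappa_n)
    return mu_n, kappa_n, alpha_n, beta_n

rng = np.random.default_rng(0)
data = rng.normal(loc=2.0, scale=0.5, size=1000)  # true precision = 1/0.5**2 = 4
mu_n, kappa_n, alpha_n, beta_n = normal_gamma_update(data)
print(mu_n, alpha_n / beta_n)  # posterior mean of mu and E[tau] = alpha_n / beta_n
```

Since $\alpha_n$ grows by $n/2$ with each batch of data, the posterior expected precision $E[\tau] = \alpha_n/\beta_n$ concentrates around the true precision as $n$ grows, which is the claim above made literal.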
The "virtual observations" are most easily explained using beta-binomial, or Dirichlet-categorical models, where the prior parameters can be thought as counts of "successes" observed a priori. With other models, this intuition may be harder to gasp.