
I want to calculate the "size" of the highest density region (HDR) that contains p% of the total probability for multivariate samples of a Bayesian posterior obtained via MCMC.

In 1D, this "size" is the length of the HDR (the sum of the interval lengths if the distribution is multimodal); in 2D it is the area, in 3D the volume, and so on.
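To make the quantity precise, here is one way to write it down. This follows the usual density-threshold definition of an HDR; the symbols $f$ (posterior density), $q = p/100$ (coverage as a fraction), $f_q$ (density threshold), $R_q$ (the HDR), and $V_q$ (its size) are notation introduced here for illustration, not taken from any particular reference.

$$
R_q = \{\, x \in \mathbb{R}^d : f(x) \ge f_q \,\}, \qquad
f_q = \sup\Big\{\, c : \int_{\{x \,:\, f(x) \ge c\}} f(x)\,dx \ge q \,\Big\},
$$

$$
V_q = \int_{\mathbb{R}^d} \mathbf{1}\big[f(x) \ge f_q\big]\,dx .
$$

So $V_q$ is the Lebesgue measure of $R_q$: a length for $d=1$, an area for $d=2$, and a $d$-dimensional volume in general.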

So far I have computed this "size" only for d=1 and d=2: I histogram the (marginal) distribution, sort the bins by count in descending order, accumulate bins (starting with the highest) until they contain p% of the samples, and then multiply the number of bins used by the bin size (width or area). This procedure is not feasible in higher dimensions, however, because the number of bins grows exponentially with d.
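For concreteness, the histogram procedure described above could be sketched roughly as follows. This is only a minimal illustration under some assumptions: equal-width bins on every axis, samples given as an array of shape (n_samples, d), and p passed as a fraction rather than a percentage. The function name `hdr_size_hist` and its parameters are hypothetical, chosen for this example.

```python
import numpy as np

def hdr_size_hist(samples, p=0.9, bins=50):
    """Histogram estimate of the HDR size (length, area, volume, ...).

    samples : array of shape (n_samples,) or (n_samples, d)
    p       : coverage as a fraction in (0, 1] (assumption: not a percentage)
    bins    : number of equal-width bins per dimension
    """
    samples = np.asarray(samples, dtype=float)
    if samples.ndim == 1:
        samples = samples[:, None]  # treat 1D input as (n_samples, 1)

    # d-dimensional histogram; edges is a list with one edge array per axis
    counts, edges = np.histogramdd(samples, bins=bins)

    # Volume of a single bin (bins are equal-width along each axis)
    bin_volume = np.prod([e[1] - e[0] for e in edges])

    # Sort bin counts from highest to lowest and accumulate them
    sorted_counts = np.sort(counts.ravel())[::-1]
    cumulative = np.cumsum(sorted_counts)

    # Smallest number of bins whose total count reaches p of all samples
    target = p * counts.sum()
    n_bins = int(np.searchsorted(cumulative, target)) + 1
    n_bins = min(n_bins, sorted_counts.size)  # guard against rounding at p = 1

    return n_bins * bin_volume
```

Called on a 1D marginal, e.g. `hdr_size_hist(chain[:, 0], p=0.68)`, this returns a length; called on two columns, `hdr_size_hist(chain[:, :2], p=0.68)`, it returns an area (`chain` here stands for a hypothetical array of MCMC samples). The histogram has `bins**d` cells, which is why this approach stops being practical beyond a few dimensions.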

Is there a good way to obtain the "total size" of the HDR over all dimensions of the multivariate samples? I know the very nice paper "Computing and Graphing Highest Density Regions" by Rob J. Hyndman, but as far as I can see it does not say how to obtain the size of the HDR. (But maybe I am missing something there?)

The reason I want to calculate this is that, in some applications, I interpret the size of the HDR as a measure of uncertainty. I then want to compare the HDR size of one set of samples with that of another set to see which has the smaller uncertainty. So I would like a measure that lets me compare the HDRs of different sample sets using all available dimensions, not only the 1D and 2D marginalizations, since the outcome of the comparison can then depend on which projection is considered.

I also thought about simply adding up the 1D HDR sizes or something similar, but I am not sure it is possible to construct a good estimate of the total size of the HDR (in all dimensions) from the marginal HDRs alone.

Any advice or idea would be appreciated!

balft