There is a histogram, $h$, with a user-defined number of bins, $k = 5$, and bin probabilities $[1/2, 1/3, 1/30, 1/30, 1/10]$. Then $1000$ histograms were simulated. I need to establish whether the model data correspond to the original histogram. The original histogram (red) and one model histogram (blue) are shown in the figure.

[Figure: the original histogram (red) and one model histogram (blue)]

I am looking for a goodness-of-fit test of the hypothesis that the model data and the empirical histogram are drawn from the same distribution.

Question: What is the general approach to such a problem?

I have calculated the Wasserstein distance, $d$, between the original histogram and each model histogram. The distribution of this statistic is shown in the figure below. The mean of the Wasserstein distance is $E(d) = 0.00267$.

[Figure: distribution of the Wasserstein distance $d$]
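For readers who want to reproduce this kind of calculation, here is a minimal sketch of how the distances could be computed. The question does not say how many observations each simulated histogram contains or where the bins sit on the axis, so `n = 1000` and the bin positions `0..4` are assumptions, as is the random seed. The resulting mean will therefore not necessarily match the value reported above.

```python
import numpy as np
from scipy.stats import wasserstein_distance

p = np.array([1/2, 1/3, 1/30, 1/30, 1/10])  # reference probabilities from the question
positions = np.arange(len(p))               # assumed bin locations 0..4
n = 1000                                    # assumed observations per histogram
n_sim = 1000                                # number of simulated histograms

rng = np.random.default_rng(0)              # arbitrary seed, for repeatability only
counts = rng.multinomial(n, p, size=n_sim)  # one row of bin counts per histogram
freqs = counts / n                          # relative frequencies per histogram

# Wasserstein distance between the reference and each simulated histogram,
# treating both as weighted point masses on the same bin positions.
d = np.array([
    wasserstein_distance(positions, positions, u_weights=p, v_weights=f)
    for f in freqs
])

print("mean distance:", d.mean())
```

The distance depends on the scale of the bin positions, so a different axis (for example, bin midpoints in the original units) rescales every value of $d$.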

Nick
  • Use a chi-squared test. For an example, including code, see https://stats.stackexchange.com/a/307989/919. – whuber Jul 28 '21 at 12:17
  • @whuber, thank you for the comment. Did I understand correctly that I first have to find a suitable distribution for the d statistic, and then apply the chi-square test? – Nick Jul 29 '21 at 04:20
  • Isn't your "model histogram" the reference distribution? – whuber Jul 29 '21 at 12:43
  • @whuber, I hope that the red histogram is my reference distribution (defined by a user), while the blue histogram was plotted from model data. – Nick Jul 29 '21 at 15:59
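
The chi-squared test suggested in the comments can be sketched roughly as follows. It compares the observed bin counts of each model histogram directly with the counts expected under the reference probabilities, so no distribution has to be fitted to the $d$ statistic first. This is only an illustration under stated assumptions: the sample size `n = 1000` per histogram and the significance level `alpha = 0.05` are not given in the question, and the counts here are simulated from the reference itself as a stand-in for the real model data.

```python
import numpy as np
from scipy.stats import chisquare

p = np.array([1/2, 1/3, 1/30, 1/30, 1/10])  # reference probabilities from the question
n = 1000        # assumed observations per histogram (not given in the question)
n_sim = 1000    # number of simulated histograms
alpha = 0.05    # assumed significance level

rng = np.random.default_rng(0)  # arbitrary seed, for repeatability only
# Stand-in for the model data: bin counts drawn from the reference itself.
counts = rng.multinomial(n, p, size=n_sim)

expected = n * p  # expected count per bin under the reference
# Pearson chi-squared goodness-of-fit test for each histogram (k - 1 = 4 df).
pvalues = np.array([chisquare(c, f_exp=expected).pvalue for c in counts])

print("p-value for the first histogram:", pvalues[0])
print("fraction rejected at alpha:", np.mean(pvalues < alpha))
```

To apply this to real model output, replace `counts` with the actual bin counts (not normalized frequencies). If the model data really follow the reference, the p-values should be roughly uniform on $[0, 1]$ and about a fraction `alpha` of the histograms should be rejected.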
