
I'm learning about fitting methods and have a question about how to find a meaningful fitted value.

Let's assume that we have a distribution of the weights of some people. If the distribution looks Gaussian, then we can make a hypothesis that the mean weight is 70 kg and the standard deviation is 5 kg. We then generate an expected histogram from this model and the total number of people. After that we can compare the two histograms (measurements and simulation) by calculating the $\chi^{2}$ statistic. Given the number of bins, $k$, we know the degrees of freedom, $k-1$. Using the $\chi^{2}$ value and the degrees of freedom, we can look up the $\chi^{2}$-distribution with $k-1$ degrees of freedom and find the corresponding p-value. If we repeat this procedure for five different hypotheses, for instance five different mean values with the same standard deviation, then we end up with five p-values.
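
For concreteness, here is a minimal sketch of this procedure in Python. The weight sample, sample size, and number of bins are made-up placeholders rather than real measurements:

```python
import numpy as np
from scipy import stats

# Placeholder "measured" weights -- assumed data, purely for illustration
rng = np.random.default_rng(0)
weights = rng.normal(71.0, 5.5, size=500)

# Hypothesis: weights ~ N(70, 5^2)
mu, sigma = 70.0, 5.0

# Histogram of the measurements with k bins
k = 10
counts, edges = np.histogram(weights, bins=k)

# Expected counts per bin under the hypothesized Gaussian, scaled to the total
# number of people (renormalized so observed and expected totals match)
expected = len(weights) * np.diff(stats.norm.cdf(edges, loc=mu, scale=sigma))
expected *= counts.sum() / expected.sum()

# Chi-squared statistic and the p-value from a chi-squared distribution
# with k - 1 degrees of freedom
chi2 = np.sum((counts - expected) ** 2 / expected)
p_value = stats.chi2.sf(chi2, df=k - 1)
print(chi2, p_value)
```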

In this case, if the five p-values are all smaller than 1%, can we say that all of these models are rejected, and that the best-fit value (mean weight) among the five is useless anyway? Then, in order to achieve a good fit, would we have to find models that result in large p-values and choose the best value among them?
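
The five-hypothesis comparison described above might look like this, continuing the sketch (the candidate means are arbitrary placeholders):

```python
# Repeat the same comparison for five hypothesized means, sigma held fixed
candidate_means = [60.0, 65.0, 70.0, 75.0, 80.0]  # assumed values
p_values = {}
for mu in candidate_means:
    expected = len(weights) * np.diff(stats.norm.cdf(edges, loc=mu, scale=sigma))
    expected *= counts.sum() / expected.sum()
    chi2 = np.sum((counts - expected) ** 2 / expected)
    p_values[mu] = stats.chi2.sf(chi2, df=k - 1)

print(p_values)  # one p-value per hypothesized mean
```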

  • This recipe for the chi-squared test is not quite correct: for the nuances, see https://stats.stackexchange.com/questions/16921/how-to-understand-degrees-of-freedom/17148?r=SearchResults&s=1|51.3377#17148. Your question otherwise raises issues of *multiple comparisons* as well as *interpreting p-values*; you might appreciate some of our posts on these topics that can be found by searching our site. – whuber Sep 04 '19 at 14:50
  • This is an awfully complex procedure. Let's step back for a minute. What is the point of this? What are you trying to do? Most likely we will not end up going this route. – gung - Reinstate Monica Sep 04 '19 at 14:55
  • If your data are already binned in (presumably coarse) bins and you no longer have the original values, then you might well use minimum chi-square to identify parameters (it's one approach to fitting discrete distributions), but you wouldn't use any comparison of p-values to choose between them; you'd simply minimize the chi-squared statistic. – Glen_b Sep 05 '19 at 01:49
  • @Glen_b By 'minimizing the chi-squared statistic', do you mean one can assume different values of the parameters, calculate the statistic for each, and find the model resulting in the minimum chi-squared statistic? – Nownuri Sep 06 '19 at 16:36
  • 1. If you're choosing between fitted models, you can calculate a chi-square for each fitted model (with parameters estimated by whatever suitable method) and use it to choose the 'best fit' by the chi-square criterion. 2. However, yes, you can use it to estimate parameters as well. It's a method with a long history, and in suitable cases it often works extremely well (and in many situations is pretty easy); people have been using it for about a century, but a lot of people are ignorant of it. ...ctd – Glen_b Sep 07 '19 at 02:18
  • ctd ... See the 1980 [paper](https://projecteuclid.org/DPubS/Repository/1.0/Disseminate?view=body&id=pdf_1&handle=euclid.aos/1176345003) by Berkson, or the Wikipedia article [Minimum chi-square estimation](https://en.wikipedia.org/wiki/Minimum_chi-square_estimation), for example. I've seen it used when fitting smooth relationships of various forms to mortality data as a function of age (for a given gender), where deaths within each age are treated as either Binomial or Poisson. – Glen_b Sep 07 '19 at 02:24
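
A rough sketch of the minimum chi-square idea from the comments, reusing the placeholder data and binning from the sketches above (the optimizer choice and starting values are assumptions, not something from the discussion):

```python
from scipy import optimize

def chi2_stat(params):
    # Chi-squared statistic as a function of the hypothesized (mu, sigma)
    mu, sigma = params
    if sigma <= 0:
        return np.inf  # keep the optimizer away from invalid scales
    expected = len(weights) * np.diff(stats.norm.cdf(edges, loc=mu, scale=sigma))
    expected *= counts.sum() / expected.sum()
    return np.sum((counts - expected) ** 2 / expected)

# Minimize the statistic over (mu, sigma) instead of comparing p-values
fit = optimize.minimize(chi2_stat, x0=[70.0, 5.0], method="Nelder-Mead")
print(fit.x)  # parameter values giving the smallest chi-squared statistic
```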
