0

I want to use several indicators to assess the goodness of fit of my model, such as R-squared, K-S, chi-square, etc.

What should I call them? Can I call them goodness-of-fit indicators? I see a paper calling them statistics.

Nick Cox
cqcn1991

2 Answers

4

In mathematics, _indicator_ has a very precise meaning:

In mathematics, an indicator function or a characteristic function is a function defined on a set $X$ that indicates membership of an element in a subset $A$ of $X$, having the value $1$ for all elements of $A$ and the value $0$ for all elements of $X$ not in $A$. It is usually denoted by a symbol $1$ or $I$, sometimes in boldface or blackboard boldface, with a subscript describing the set.
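
Written out as a formula, the definition quoted above reads as follows (standard notation, added here for illustration and not part of the quoted text):

$$
\mathbf{1}_A(x) =
\begin{cases}
1, & x \in A, \\
0, & x \in X \setminus A.
\end{cases}
$$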

In the case of goodness of fit, by contrast, we usually talk about measures, statistics, or tests, e.g.

The goodness of fit of a statistical model describes how well it fits a set of observations. Measures of goodness of fit typically summarize the discrepancy between observed values and the values expected under the model in question. Such measures can be used in statistical hypothesis testing, e.g. to test for normality of residuals, to test whether two samples are drawn from identical distributions (see Kolmogorov–Smirnov test), or whether outcome frequencies follow a specified distribution (see Pearson's chi-squared test). In the analysis of variance, one of the components into which the variance is partitioned may be a lack-of-fit sum of squares.

Notice that goodness-of-fit statistics measure how well your model fits the data. In some cases you can conduct a statistical test to check this. However, they never indicate (i.e. show, point, demonstrate) that your model fits: such decisions are always made subjectively, based on the available measures, since it is never a yes/no kind of situation.
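
To make the "measure, not yes/no" point concrete, here is a minimal sketch of two such statistics computed by hand. The data, the choice of a standard normal as the model, and the function names are all assumptions made up for illustration; in practice you would likely use a statistics library rather than these hand-rolled helpers.

```python
import math


def ks_statistic(sample, cdf):
    """Largest gap between the empirical CDF of `sample` and a model CDF."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        # The empirical CDF jumps from i/n to (i+1)/n at x; check both sides.
        d = max(d, (i + 1) / n - f, f - i / n)
    return d


def pearson_chi_square(observed, expected):
    """Sum of (observed - expected)^2 / expected over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def standard_normal_cdf(x):
    """CDF of the standard normal distribution (the assumed model here)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


# Hypothetical data, invented purely for illustration.
sample = [-1.2, -0.4, 0.1, 0.3, 0.9, 1.5]
observed = [18, 22, 20]
expected = [20.0, 20.0, 20.0]

ks = ks_statistic(sample, standard_normal_cdf)
chi2 = pearson_chi_square(observed, expected)

# Both results are numbers on a continuous scale, not 0/1 indicators.
print(f"K-S statistic: {ks:.3f}")
print(f"chi-square statistic: {chi2:.3f}")
```

Each function returns a number that summarizes a discrepancy. Turning that number into a decision requires a further step (a reference distribution, a threshold, or judgment), which is why "measure" or "statistic" describes it better than "indicator" in the mathematical sense.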

You might have been thinking of "estimator" (as "estimator" and "indicator" sound similar), but this would also be incorrect, since you are not estimating any property of your data.

So you can call them "measures"; sometimes they are called "statistics"; and if you were talking solely about tests (which $R^2$ obviously is not), you could call them "tests".

Tim
  • +1. The precise meaning of indicator in mathematics is well taken; indeed personally I much prefer the term _indicator variable_ to the awful and awkward _dummy variable_. But nothing can undermine the usefulness of indicator as a looser, informal and general term for any measure or piece of evidence that indicates. There is a splendid discussion early in Mosteller, F. and Tukey, J.W. 1977. _Data analysis and regression_. Reading, MA: Addison-Wesley of the importance of indication in statistics. There is a literature on "social indicators", and so on. – Nick Cox Apr 06 '16 at 07:57
  • @NickCox I would agree if we were talking about social science, biology etc., but if this is a paper with mathematical content, then using a phrase that has a pretty precise meaning is not a good idea. Moreover, those statistics *measure* the extent of goodness of fit rather than *indicating* that something fits or not. – Tim Apr 06 '16 at 08:02
  • As in this thread http://stats.stackexchange.com/questions/202879/what-are-the-most-misused-statistics-terms-that-we-should-care-to-correct mathematicians or statistical people can use an existing word and give it an exact formal meaning, but they then should not imply that to be the only correct usage. The everyday senses of _group_, _field_ or _ring_ were not made obsolete when those terms were adopted in algebra. – Nick Cox Apr 06 '16 at 08:02
  • I don't know the OP's field and in any case the question is general. But in any applied field, it is prudent to use terms that people reading a paper will be comfortable with, just as it is also a good idea to use exact and correct terminology. There can be some tension between those aims. When I write in journals on geology, glaciology, forestry, etc. I need to think what will be clear and most easily understood. – Nick Cox Apr 06 '16 at 08:06
  • @NickCox totally agree :) The only thing that I'm saying is: "indicator" is an unfortunate choice, as it has its own meaning that can be understood differently from the author's intent. – Tim Apr 06 '16 at 08:13
  • Indeed; many threads on CV alone show that people with overlapping but not identical understanding and terminology can fail to understand each other almost completely. – Nick Cox Apr 06 '16 at 08:17
  • I assume R square could be called a GOF measure? But is there a plural form when I am stating multiple measures? – cqcn1991 Apr 06 '16 at 08:18
  • _Goodness-of-fit measures_ is plural, so I am not clear what else you seek. – Nick Cox Apr 06 '16 at 08:20
2

They are statistics insofar as they are calculated from samples, but that term is too broad to be a good answer.

The term goodness of fit is well established, except that some of those measures are inverse and really measure badness of fit, a term occasionally used, by J.B. Kruskal among others. But that objection is at most a quibble, and it's well understood in statistics, as elsewhere, that performance measures can be direct or inverse, so that usually people want high $R^2$ and low chi-square, or low inflation and high GDP growth, or whatever.

As another example, root mean square error should ideally be low, with nothing else said.
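
As a rough illustration of direct versus inverse measures, the sketch below computes $R^2$ (higher is better) and root mean square error (lower is better) for two sets of fitted values. The numbers and the names `close_fit` and `poor_fit` are hypothetical, chosen only to show the two measures moving in opposite directions.

```python
import math


def r_squared(y, y_hat):
    """1 - (residual sum of squares) / (total sum of squares)."""
    mean_y = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, y_hat))
    ss_tot = sum((a - mean_y) ** 2 for a in y)
    return 1.0 - ss_res / ss_tot


def rmse(y, y_hat):
    """Root mean square error between observed and fitted values."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y, y_hat)) / len(y))


# Hypothetical observations and two hypothetical sets of fitted values.
y = [1.0, 2.0, 3.0, 4.0]
close_fit = [1.1, 1.9, 3.2, 3.8]
poor_fit = [2.0, 2.0, 3.0, 3.0]

for name, fitted in [("close_fit", close_fit), ("poor_fit", poor_fit)]:
    print(f"{name}: R^2 = {r_squared(y, fitted):.3f}, RMSE = {rmse(y, fitted):.3f}")
```

The closer fit has the higher $R^2$ and the lower RMSE, so both measures agree on the ranking even though one is direct and the other inverse.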

(If you know about overfitting, you'll know the limitations of these comments, but here the focus is on terminology.)

I like the term figures of merit, as an all-purpose term used in several fundamental and applied sciences. Again, merit can be direct (number of Nobel Prizes won) or inverse (number of downvotes on Cross Validated).

The term indicator is not especially good for your purpose and certainly not synonymous with goodness-of-fit measures. As @Tim explains in his answer, there is a much more specific technical sense, and as I emphasise in my comments on his answer, there is a much wider, less technical sense for the term indicator. There is no middle ground in which indicators mean goodness-of-fit statistics, exactly. But if you were to talk about goodness-of-fit indicators, I don't think you would be misunderstood. It's just not an especially good term for the purpose.

Nick Cox