Questions tagged [goodness-of-fit]

Goodness of fit tests indicate whether or not it is reasonable to assume that a random sample comes from a specific distribution.

They are a form of hypothesis testing where the null and alternative hypotheses are:

H0: Sample data come from the stated distribution
HA: Sample data do not come from the stated distribution

These tests are sometimes called omnibus tests.

Reference:

Ricci, V. (2005). Fitting distributions with R.
Retrieved from: http://cran.r-project.org/doc/contrib/Ricci-distributions-en.pdf
page 16.
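As a rough illustration of the null/alternative setup described above, here is a minimal sketch of a Pearson chi-square goodness-of-fit test against a discrete uniform distribution (a twenty-sided die). It is written in Python with the standard library only. The function names, the simulated rolls, and the seed are assumptions made for the example and come from none of the questions listed here. The critical value 30.144 is the upper 5% point of a chi-square distribution with 19 degrees of freedom.

```python
import random


def chi_square_statistic(observed, expected):
    """Pearson statistic: sum over categories of (O - E)^2 / E."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def uniform_gof(counts):
    """Statistic and degrees of freedom for H0: all categories equally likely."""
    n = sum(counts)
    k = len(counts)
    expected = [n / k] * k  # equal expected count in every category under H0
    return chi_square_statistic(counts, expected), k - 1


# Hypothetical data: 2000 simulated rolls of a fair d20 (illustration only).
rng = random.Random(42)
rolls = [rng.randint(1, 20) for _ in range(2000)]
counts = [rolls.count(face) for face in range(1, 21)]

stat, df = uniform_gof(counts)
CRITICAL_95_DF19 = 30.144  # upper 5% point of chi-square with 19 df

print(f"chi-square = {stat:.2f} on {df} df")
print("reject H0 at the 5% level" if stat > CRITICAL_95_DF19
      else "no evidence against H0 at the 5% level")
```

A statistic above the critical value is evidence against the stated distribution. A statistic below it only means the test found no evidence of misfit, which is not the same as showing that the data come from that distribution.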

1058 questions
183
votes
2 answers

How to determine which distribution fits my data best?

I have a dataset and would like to figure out which distribution fits my data best. I used the fitdistr() function to estimate the necessary parameters to describe the assumed distribution (i.e. Weibull, Cauchy, Normal). Using those parameters I…
68
votes
8 answers

Which pseudo-$R^2$ measure is the one to report for logistic regression (Cox & Snell or Nagelkerke)?

I have SPSS output for a logistic regression model. The output reports two measures for the model fit, Cox & Snell and Nagelkerke. So as a rule of thumb, which of these $R^2$ measures would you report as the model fit? Or, which of these fit indices…
Henrik
45
votes
2 answers

PP-plots vs. QQ-plots

What is the difference between probability plots, PP-plots and QQ-plots when trying to analyse a fitted distribution to data?
kay
45
votes
8 answers

How can I test if given samples are taken from a Poisson distribution?

I know of normality tests, but how do I test for "Poisson-ness"? I have a sample of ~1000 non-negative integers, which I suspect are taken from a Poisson distribution, and I would like to test that.
34
votes
2 answers

Raw residuals versus standardised residuals versus studentised residuals - what to use when?

This looks like a similar question and didn't get many responses. Omitting tests such as Cook's D, and just looking at residuals as a group, I am interested in how others use residuals when assessing goodness-of-fit. I use the raw residuals: in a…
Michelle
34
votes
2 answers

Degrees of freedom of $\chi^2$ in Hosmer-Lemeshow test

The test statistic for the Hosmer-Lemeshow test (HLT) for goodness of fit (GOF) of a logistic regression model is defined as follows: The sample is then split into $d=10$ deciles, $D_1, D_2, \dots , D_{d}$, per decile one computes the following…
34
votes
6 answers

Interpretation of Shapiro-Wilk test

I'm pretty new to statistics and I need your help. I have a small sample, as follows: H4U 0.269 0.357 0.2 0.221 0.275 0.277 0.253 0.127 0.246 I ran the Shapiro-Wilk test using R: shapiro.test(precisionH4U$H4U) and I got the…
29
votes
6 answers

How can I test the fairness of a d20?

How can I test the fairness of a twenty sided die (d20)? Obviously I would be comparing the distribution of values against a uniform distribution. I vaguely remember using a Chi-square test in college. How can I apply this to see if a die is…
28
votes
1 answer

Kolmogorov-Smirnov with discrete data: What is proper use of dgof::ks.test in R?

Beginner questions: I want to test whether two discrete data sets come from the same distribution. A Kolmogorov-Smirnov test was suggested to me. Conover (Practical Nonparametric Statistics, 3d) seems to say that the Kolmogorov-Smirnov Test can be…
Mars
28
votes
7 answers

Distribution hypothesis testing - what is the point of doing it if you can't "accept" your null hypothesis?

Various hypothesis tests, such as the $\chi^{2}$ GOF test, Kolmogorov-Smirnov, Anderson-Darling, etc., follow this basic format: $H_0$: The data follow the given distribution. $H_1$: The data do not follow the given distribution. Typically, one…
27
votes
2 answers

What's the Bayesian equivalent of a general goodness of fit test?

I have two data sets, one from a set of physical observations (temperatures), and one from an ensemble of numerical models. I'm doing a perfect-model analysis, assuming the model ensemble represents a true, independent sample, and checking to see if…
naught101
27
votes
4 answers

What does negative R-squared mean?

Let's say I have some data, and then I fit the data with a model (a non-linear regression). Then I calculate the R-squared ($R^2$). When R-squared is negative, what does that mean? Does that mean my model is bad? I know the range of $R^2$ can be…
RockTheStar
27
votes
3 answers

Evaluating logistic regression and interpretation of Hosmer-Lemeshow Goodness of Fit

As we all know, there are 2 methods to evaluate the logistic regression model, and they test very different things. Predictive power: get a statistic that measures how well you can predict the dependent variable based on the independent…
26
votes
3 answers

Is my model any good, based on the diagnostic metric ($R^2$/ AUC/ accuracy/ RMSE etc.) value?

I've fitted my model and am trying to understand whether it's any good. I've calculated the recommended metrics to assess it ($R^2$/ AUC / accuracy / prediction error / etc) but do not know how to interpret them. In short, how do I tell if my model…
mkt
24
votes
3 answers

How do I check if my data fits an exponential distribution?

How could I check in R whether my data (e.g. salary) come from a continuous exponential distribution? Here is a histogram of my sample: [image not shown]. Any help will be greatly appreciated!
stjudent