I need to model a response variable with a GLM. The response is a count (the number of insurance claims over a year), so it seems natural to assume a Poisson distribution for it. How can I validate this assumption with a goodness-of-fit test in R?
-
Sorry, if your response variable depends on covariates (which it should if you want to model it with a GLM), it probably is not Poisson distributed marginally. However, that does not matter for regression. What matters is the residual distribution, i.e. the distribution of the response conditional on the covariates. – Roland May 14 '18 at 13:03
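The point in the comment above can be illustrated with a small simulation. This is only a sketch: the data, the covariate `x` and the coefficients are made up for illustration. Counts that are Poisson given a covariate look clearly overdispersed when the covariate is ignored, yet the Poisson GLM itself is well specified.

```r
set.seed(1)
n  <- 2000
x  <- rnorm(n)                  # hypothetical covariate
mu <- exp(0.5 + 0.8 * x)        # conditional mean under a log link
y  <- rpois(n, mu)              # response is Poisson *given* x

# Ignoring x: the variance is much larger than the mean,
# so the pooled counts do not look Poisson
marginal_ratio <- var(y) / mean(y)

# Conditioning on x: the Pearson dispersion statistic of the
# Poisson GLM should be close to 1
fit <- glm(y ~ x, family = poisson)
pearson_disp <- sum(residuals(fit, type = "pearson")^2) / df.residual(fit)

marginal_ratio   # well above 1
pearson_disp     # close to 1
```

So a goodness-of-fit test applied to the raw counts would reject the Poisson assumption here even though the regression model is the right one.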
-
You will need Poisson regression, and then this post will be useful: https://stats.stackexchange.com/questions/331086/investigate-overdispersion-in-a-plot-for-a-poisson-regression/346041#346041 – kjetil b halvorsen May 14 '18 at 13:20
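As a rough sketch of what checking a fitted Poisson regression for overdispersion might look like (this is not taken from the linked post; the simulated data and variable names are assumptions), one can compute the Pearson dispersion statistic and compare against a negative binomial fit from `MASS::glm.nb()`:

```r
library(MASS)  # glm.nb()

set.seed(42)
n  <- 1000
x  <- runif(n)                        # hypothetical covariate
mu <- exp(0.2 + 1.0 * x)
y  <- rnbinom(n, mu = mu, size = 2)   # deliberately overdispersed counts

# Fit the Poisson regression and compute the Pearson chi-squared statistic
pois_fit      <- glm(y ~ x, family = poisson)
pearson_chisq <- sum(residuals(pois_fit, type = "pearson")^2)
dispersion    <- pearson_chisq / df.residual(pois_fit)

# Approximate chi-squared test: a small p-value signals lack of fit
p_value <- pchisq(pearson_chisq, df.residual(pois_fit), lower.tail = FALSE)

# Negative binomial alternative with the same linear predictor
nb_fit <- glm.nb(y ~ x)

dispersion              # noticeably above 1 for these data
p_value
AIC(pois_fit, nb_fit)   # lower AIC indicates the better-supported model
```

A dispersion statistic well above 1 suggests that a quasi-Poisson or negative binomial model may be more appropriate than plain Poisson.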
-
[This](https://cran.r-project.org/doc/contrib/Ricci-distributions-en.pdf) document has a good rundown of most distribution goodness-of-fit measures in R. I would particularly look at `fitdistr()` in the `MASS` package and `goodfit()` in the `vcd` package. – NatWH May 14 '18 at 12:59
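A minimal sketch of how the two functions mentioned above might be used, assuming a plain vector of counts (`claims` is simulated here purely for illustration). Note that this tests the pooled counts against a single Poisson distribution and ignores any covariates:

```r
library(MASS)  # fitdistr()
library(vcd)   # goodfit()

set.seed(123)
claims <- rpois(500, lambda = 1.3)   # hypothetical yearly claim counts

# Maximum-likelihood estimate of the Poisson rate
ml <- fitdistr(claims, densfun = "Poisson")
ml$estimate

# Fit a Poisson distribution to the observed frequencies and
# test the fit (summary() reports a likelihood-ratio test)
gf <- goodfit(claims, type = "poisson", method = "ML")
summary(gf)

# Rootogram comparing observed and fitted frequencies
plot(gf)
```

For real claims data, replace the simulated `claims` vector with the observed counts.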
-
Possible duplicate of [Diagnostic plots for count regression](https://stats.stackexchange.com/questions/70558/diagnostic-plots-for-count-regression) – kjetil b halvorsen Jun 27 '18 at 11:19