I need to fit a GLM (generalized linear model) in R, something like
model <- glm(Daily.Results ~ sqrt.of.Clicks + Days + log.of.Clicks * sqrt.of.Days, family = poisson)
which relates the following variables:
Daily.Results = the count variable that I need to model (and predict)
Clicks = number of clicks on a link (which leads a person to become a potential buyer)
sqrt.of.Clicks = square root of Clicks
log.of.Clicks = natural logarithm of Clicks
Days = 1, 2, 3, ...
sqrt.of.Days = square root of Days
The dataset has 14 rows/observations. I chose family = poisson, with its default log link function, because Daily.Results takes non-negative integer values.
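The setup described above can be sketched as follows. The real 14 observations are not reproduced here, so the data frame `dat` and its simulated values are hypothetical placeholders; only the variable names and the formula come from my actual model.

```r
# Hypothetical stand-in data: the real 14 observations are not shown,
# so Clicks and Daily.Results are simulated purely for illustration.
set.seed(1)
dat <- data.frame(
  Days   = 1:14,
  Clicks = sample(50:800, 14)
)
dat$Daily.Results <- rpois(14, lambda = 0.05 * dat$Clicks)

# Derived predictors, named as in the formula
dat$sqrt.of.Clicks <- sqrt(dat$Clicks)
dat$log.of.Clicks  <- log(dat$Clicks)
dat$sqrt.of.Days   <- sqrt(dat$Days)

# Poisson GLM; the log link is the default for this family
model <- glm(
  Daily.Results ~ sqrt.of.Clicks + Days + log.of.Clicks * sqrt.of.Days,
  family = poisson(link = "log"),
  data   = dat
)
summary(model)
```

Note that `log.of.Clicks * sqrt.of.Days` expands to both main effects plus their interaction, which is why the fit has six coefficients.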
The output of summary(model) is:
Call:
glm(formula = Daily.Results ~ sqrt.of.Clicks + Days + log.of.Clicks *
    sqrt.of.Days, family = "poisson")

Deviance Residuals:
     Min        1Q    Median        3Q       Max
-2.24096  -0.61927  -0.09041   0.81640   2.45392

Coefficients:
                            Estimate Std. Error z value Pr(>|z|)
(Intercept)                120.88890   22.33723   5.412 6.23e-08 ***
sqrt.of.Clicks               0.27487    0.07698   3.571 0.000356 ***
Days                         0.35892    0.09334   3.845 0.000120 ***
log.of.Clicks              -15.84863    3.20527  -4.945 7.63e-07 ***
sqrt.of.Days               -23.43680    4.22029  -5.553 2.80e-08 ***
log.of.Clicks:sqrt.of.Days   2.66063    0.47141   5.644 1.66e-08 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for poisson family taken to be 1)

    Null deviance: 499.316  on 13  degrees of freedom
Residual deviance:  23.706  on  8  degrees of freedom
AIC: 141.81

Number of Fisher Scoring iterations: 4
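For reference, here are a few summary quantities that can be derived from the deviances reported above. This is only a sketch: the variable names are my own, and I am assuming (without being sure) that the ratio of residual deviance to its degrees of freedom is a rough overdispersion check and that 1 - residual/null deviance is a reasonable analogue of R squared.

```r
# Values copied from the summary(model) output
null_dev  <- 499.316
resid_dev <- 23.706
resid_df  <- 8

# Deviance-based analogue of R squared (share of null deviance explained)
pseudo_r2 <- 1 - resid_dev / null_dev        # about 0.95

# Residual deviance per degree of freedom; values well above 1
# are often read as a sign of overdispersion
dispersion_ratio <- resid_dev / resid_df     # about 2.96

# Chi-squared goodness-of-fit p-value for the residual deviance
# (an approximation that may be unreliable with only 14 observations)
gof_p <- pchisq(resid_dev, df = resid_df, lower.tail = FALSE)

print(c(pseudo_r2 = pseudo_r2,
        dispersion_ratio = dispersion_ratio,
        gof_p = gof_p))
```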
The model above is the one with the lowest AIC and residual deviance among those I tried (after experimenting with log and square-root transformations of the data).
- I do not know how good or bad this model is. I often read people here saying that a lower AIC and a lower residual deviance mean a better fit. Is this correct, and is it enough?
- It would also be nice to know which values can be considered low enough (R squared is far friendlier in that respect). Can you help me with that?
- Can a really good fit lead to bad predictions? Can you give an example?
- The diagnostic plots of my regression (plot(model)) follow, but I cannot tell what information, if any, they show. Perhaps they contain evidence of problems that I have not noticed.
I think a definitive, clear answer to these questions would help a lot of people here. Please keep it straight to the point.