I ran a few GLMs and linear models with an offset. Each row in the dataset is a healthcare user. The data contain each user's medical payments and ICU days between 2000 and 2007. The number of years with institute contact differs between users (e.g. some came to the medical institute only in 2001-2003, whereas others came in every year), so I thought I should use the number of years as an offset to account for the "observation period". It also seems reasonable to expect that the more years of contact a user has, the higher the payments and ICU days.
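To make explicit what I understand the offset to do in the log-link models (the symbols β0, β1, β2 and the indicator male_i are notation I am introducing here for the intercept, gender and age terms; Y_i is the payment or ICU days of user i):

$$
\log E[Y_i] = \log(\text{years}_i) + \beta_0 + \beta_1\,\text{male}_i + \beta_2\,\text{age}_i
\quad\Longleftrightarrow\quad
\frac{E[Y_i]}{\text{years}_i} = \exp\left(\beta_0 + \beta_1\,\text{male}_i + \beta_2\,\text{age}_i\right)
$$

If that is right, the coefficients in those models would describe the expected outcome per year of contact rather than the total over the whole period.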
Gamma:
Call:
glm(formula = payment_amt ~ offset(log(years)) +
as.factor(gender) + age,
family = Gamma(link = "log"), data = pm, control = glm.control(maxit = 50))
Deviance Residuals:
Min 1Q Median 3Q Max
-3.8787 -1.2142 -0.5339 0.1904 15.1442
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) 4.6718536 0.0134132 348.3 <2e-16 ***
as.factor(gender)M 0.7800695 0.0024625 316.8 <2e-16 ***
age 0.0642834 0.0001908 337.0 <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Dispersion parameter for Gamma family taken to be 1.238685)
Null deviance: 1520252 on 852449 degrees of freedom
Residual deviance: 1251859 on 852447 degrees of freedom
AIC: 20497443
Number of Fisher Scoring iterations: 8
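My tentative reading of the Gamma output is that, because of the log link, exponentiating a coefficient gives a multiplicative effect on the expected payment per year of contact. A minimal sketch of that arithmetic, using the estimates copied from the summary above (the object names and the 50-year-old example person are only for illustration):

```r
# Estimates copied by hand from the Gamma summary above (log link).
gamma_coef <- c(intercept = 4.6718536, genderM = 0.7800695, age = 0.0642834)

# With a log link, exp(coefficient) is a ratio of expected values.
# The offset log(years) has its coefficient fixed at 1, so these ratios
# refer to expected payment per year of contact.
ratios <- exp(gamma_coef[c("genderM", "age")])
print(round(ratios, 3))

# Expected payment per year for a hypothetical 50-year-old at the
# baseline gender level (the level that is not "M").
per_year_base_50 <- exp(gamma_coef[["intercept"]] + 50 * gamma_coef[["age"]])

# Same age, gender level "M": the baseline value times the gender ratio.
per_year_m_50 <- per_year_base_50 * ratios[["genderM"]]
print(c(baseline = per_year_base_50, M = per_year_m_50))
```

Under this reading, the gender ratio compares level "M" with the baseline level at the same age, and the age ratio is the change per additional year of age.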
OLS:
Call:
lm(formula = payment_amt ~ offset(years) +
as.factor(gender) + age,
data = pm)
Residuals:
Min 1Q Median 3Q Max
-170257 -53628 -23808 15835 8808825
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -206943.18 1425.83 -145.1 <2e-16 ***
as.factor(gender)M 48794.00 261.77 186.4 <2e-16 ***
age 3547.31 20.28 174.9 <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 118300 on 852447 degrees of freedom
Multiple R-squared: 0.07811, Adjusted R-squared: 0.0781
F-statistic: 3.611e+04 on 2 and 852447 DF, p-value: < 2.2e-16
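For the OLS model I used offset(years) on the raw scale, and I am not sure that does what I intended. As far as I understand, an offset in lm() is simply subtracted from the response, so the model above is a regression of payment_amt minus years on gender and age. A small sketch on simulated data (made-up numbers and variable names, not my dataset) that checks this equivalence:

```r
set.seed(1)

# Simulated data for illustration only.
n     <- 200
years <- sample(1:8, n, replace = TRUE)
age   <- runif(n, min = 20, max = 80)
y     <- 100 * years + 5 * age + rnorm(n, sd = 10)

# Linear model with the offset entered on the raw scale, as in my OLS call.
fit_offset <- lm(y ~ offset(years) + age)

# Same model written as a regression of (y - years) on age.
fit_shift <- lm(I(y - years) ~ age)

# The two sets of coefficients should agree.
same_coef <- all.equal(unname(coef(fit_offset)), unname(coef(fit_shift)))
print(same_coef)
```

If this holds, the OLS coefficients are not per-year effects in the way the log-link models' coefficients are, since subtracting the number of years from a payment amount does not rescale it by the observation period.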
Poisson:
Call:
glm(formula = icu_days ~ offset(log(years)) + as.factor(gender) +
age, family = poisson(link = "log"),
data = pm)
Deviance Residuals:
Min 1Q Median 3Q Max
-56.95 -15.11 -7.11 3.22 747.64
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -5.518e-01 9.058e-04 -609.2 <2e-16 ***
as.factor(gender)M 6.357e-01 1.341e-04 4738.9 <2e-16 ***
age 6.395e-02 1.246e-05 5130.9 <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Dispersion parameter for poisson family taken to be 1)
Negative binomial:
Call:
glm.nb(formula = icu_days ~ offset(log(years)) +
as.factor(gender) +
age, data = pm, init.theta = 0.9279403178,
link = log)
Deviance Residuals:
Min 1Q Median 3Q Max
-2.7720 -1.0788 -0.4652 0.1641 17.2095
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -1.0038131 0.0126237 -79.52 <2e-16 ***
as.factor(gender)M 0.5977179 0.0023017 259.69 <2e-16 ***
age 0.0708916 0.0001795 394.97 <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Dispersion parameter for Negative Binomial(0.9279) family taken to be 1)
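For the two count models, my assumption is that the exponentiated coefficients can be read as rate ratios for ICU days per year of contact, again because of the log link and the log(years) offset. A short sketch that puts the Poisson and negative binomial estimates side by side (values copied from the summaries above; the object names are mine):

```r
# Estimates copied by hand from the Poisson and negative binomial summaries.
pois_coef <- c(genderM = 6.357e-01, age = 6.395e-02)
nb_coef   <- c(genderM = 0.5977179, age = 0.0708916)

# exp(coefficient) = ratio of expected ICU days per year of contact.
rate_ratios <- rbind(poisson = exp(pois_coef), negbin = exp(nb_coef))
print(round(rate_ratios, 3))

# Ratio implied by a 10-year difference in age under the negative binomial fit.
nb_age_10 <- exp(10 * nb_coef[["age"]])
print(nb_age_10)
```

The point estimates from the two models are fairly close, while the standard errors differ a lot between them.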
How do I interpret the coefficients?