Why does adding an offset change the coefficients in a Poisson regression?

Question

Suppose I run several Poisson regressions in which the only difference is the offset. Why are my estimated coefficients different? The offset is just like any other predictor in a linear model; the coefficients of the other terms shouldn't change when it is uncorrelated with them.

For example, if you run the last line a few times, the coefficient on $x$ will be different each time.

x      <- rnorm(100, sd = 0.1)
y      <- rpois(100, exp(5 * x))

summary(glm(y ~ x, family = 'poisson', offset = log(rpois(100, 5) + 1)))

Thomas Bilach · Accepted Answer · 2020-05-28T00:24:41.707

Why are my estimated coefficients different?

They should be different. You're no longer modeling count data. You're modeling rates.

The offset is just like any other predictor in a linear model, the coefficients of the other terms shouldn't change when it is uncorrelated.

No. The offset is not your typical covariate. The offset is a predictor whose coefficient is constrained to equal 1. If you move the offset to the left-hand side and invoke the properties of logarithms, you end up with your outcome divided by your offset. See this post for more information on the derivation. Note that once you divide by your offset/exposure (e.g., time, population size, geographic area, etc.), your coefficient on $x$ should change.
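
As a sketch of that derivation, in notation assumed here rather than taken from the linked post ($\mu = E[y \mid x]$, exposure $e$, coefficients $\beta_0$ and $\beta_1$), the offset model and its rate form are:

$$
\log \mu = \beta_0 + \beta_1 x + 1 \cdot \log e
\quad\Longleftrightarrow\quad
\log \frac{\mu}{e} = \beta_0 + \beta_1 x .
$$

So $\beta_1$ describes how the expected rate $\mu / e$ changes with $x$, not how the expected count $\mu$ changes.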

The R code below should help with the intuition. I modeled the outcome using two methods which produce the same coefficient estimates. The first method uses offset(.) inside of the glm() function. The second method models the rate explicitly. Note that once we divide the outcome $y$ by the exposure $e$, it alters the variance of the response. To correct for this, we weight by the exposure (i.e., weights = e) when fitting the model. Both approaches produce identical coefficients.
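
A sketch of why the two fits should agree, assuming the log link and writing $\lambda_i = \exp(\beta_0 + \beta_1 x_i)$ and $\mathbf{x}_i = (1, x_i)^\top$ (notation assumed here, not in the original). The offset model has mean $e_i \lambda_i$, so its estimating equations are

$$
\sum_i \left( y_i - e_i \lambda_i \right) \mathbf{x}_i = 0 .
$$

The rate model has response $y_i / e_i$, mean $\lambda_i$, variance function $V(\lambda_i) = \lambda_i$, and prior weights $e_i$, which gives

$$
\sum_i e_i \, \frac{y_i / e_i - \lambda_i}{\lambda_i} \, \lambda_i \, \mathbf{x}_i
= \sum_i \left( y_i - e_i \lambda_i \right) \mathbf{x}_i = 0 .
$$

The two sets of equations coincide, so they have the same solution.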

# R Example (Poisson Exposures)

set.seed(13)

x <- rnorm(100, sd = 0.1)
y <- rpois(100, exp(5 * x))

e <- rpois(100, 5) + 1       # this is your offset/exposure
y_weighted <- y / e          # outcome divided by the offset/exposure (a rate)

### --- Using offset(.)

mod_1 <- glm(y ~ x + offset(log(e)), family = 'poisson')

### --- Using the weighted outcome

mod_2 <- glm(y_weighted ~ x, family = 'poisson', weights = e)

round(mod_1$coefficients, 3)
# (Intercept)           x 
#      -1.904       5.907 
round(mod_2$coefficients, 3)
# (Intercept)           x 
#      -1.904       5.907

Again, there is no coefficient estimated on your exposure variable. You are not holding $e$ fixed while assessing the impact of $x$ on $y$. You are, in fact, dividing the outcome by $e$ (i.e., the offset/exposure). Correlation between the exposure variable and the other regressors shouldn't concern you in this setting. See this answer for more information.
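
To make the contrast with an ordinary covariate concrete, here is a minimal sketch that reuses the simulation above. The names `fit_offset` and `fit_covariate` are illustrative and do not appear in the original answer. Entering log(e) as a regular term makes glm() estimate a coefficient for it, whereas offset() fixes that coefficient at 1 and estimates nothing.

```r
set.seed(13)

x <- rnorm(100, sd = 0.1)
y <- rpois(100, exp(5 * x))
e <- rpois(100, 5) + 1       # exposure, as in the example above

# Offset: the coefficient on log(e) is fixed at 1 and is not estimated
fit_offset <- glm(y ~ x + offset(log(e)), family = poisson)

# Ordinary covariate: a free coefficient is estimated for log(e)
fit_covariate <- glm(y ~ x + log(e), family = poisson)

names(coef(fit_offset))      # intercept and x only
names(coef(fit_covariate))   # intercept, x, and log(e)
```

Only the second fit answers the "holding $e$ fixed" question; the first one models the rate.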

The Poisson glm throws a warning but it shouldn't concern you. For model 2, try `family = 'quasipoisson'` to rid yourself of any messages. The coefficients shouldn't change. — Thomas Bilach, May 27 '20 at 21:27
