Does anybody know why offset in a Poisson regression is used? What do you achieve by this?

gung - Reinstate Monica
MarkDollar

1 Answer

Here is an example of where an offset is useful.

Poisson regression is typically used to model count data. Sometimes, however, it is more relevant to model rates than counts, for example when individuals are not followed for the same amount of time: six cases over 1 year should not count the same as six cases over 10 years. So, instead of having

$\log \mu_x = \beta_0 + \beta_1 x$

(where $\mu_x$ is the expected count for those with covariate $x$), you have

$\log \tfrac{\mu_x}{t_x} = \beta'_0 + \beta'_1 x$

(where $t_x$ is the exposure time for those with covariate $x$). Now, the last equation can be rewritten as

$\log \mu_x = \log t_x + \beta'_0 + \beta'_1 x$

and $\log t_x$ plays the role of an offset: a known term added to the linear predictor whose coefficient is fixed at 1 rather than estimated.
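To make the rewritten equation concrete, here is a minimal sketch in plain Python (it is not part of the original answer). It fits $\log \mu_i = \log t_i + \beta'_0 + \beta'_1 x_i$ by Newton-Raphson. The data, the function name `fit_poisson_with_offset`, and the choice of a binary covariate are all made up for illustration; in practice you would pass the offset to your GLM software rather than hand-roll the fit.

```python
import math


def fit_poisson_with_offset(x, y, t, n_iter=25):
    """Fit log(mu_i) = log(t_i) + b0 + b1 * x_i by Newton-Raphson.

    x: covariate values, y: observed counts, t: exposure times (all > 0).
    Hypothetical helper written for illustration only.
    """
    # Start from the overall rate so the first Newton step is well behaved.
    b0 = math.log(sum(y) / sum(t))
    b1 = 0.0
    for _ in range(n_iter):
        # Expected counts: log(t_i) enters with a fixed coefficient of 1.
        mu = [math.exp(math.log(ti) + b0 + b1 * xi) for xi, ti in zip(x, t)]
        # Score vector (gradient of the Poisson log-likelihood).
        g0 = sum(yi - mi for yi, mi in zip(y, mu))
        g1 = sum(xi * (yi - mi) for xi, yi, mi in zip(x, y, mu))
        # Fisher information matrix entries (2 x 2, symmetric).
        h00 = sum(mu)
        h01 = sum(xi * mi for xi, mi in zip(x, mu))
        h11 = sum(xi * xi * mi for xi, mi in zip(x, mu))
        det = h00 * h11 - h01 * h01
        # Newton update: beta <- beta + H^{-1} g.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


# Made-up data: two groups (x = 0 or 1) followed for 1, 10 and 5 years.
x = [0, 0, 0, 1, 1, 1]
t = [1, 10, 5, 1, 10, 5]
y = [2, 18, 11, 6, 55, 29]

b0, b1 = fit_poisson_with_offset(x, y, t)
rate_x0 = math.exp(b0)        # estimated events per year when x = 0
rate_x1 = math.exp(b0 + b1)   # estimated events per year when x = 1
print(rate_x0, rate_x1)
```

With a single binary covariate, the fitted rate in each group should equal that group's total count divided by its total exposure, which is a handy sanity check that the offset is doing what the derivation says.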

ocram
  • 2
    Hey, thanks very much! So did I get it right that it is necessary to use an offset when you compare counts over different times? – MarkDollar May 24 '11 at 19:34
  • 5
    You should weight by $t_x$ when you model the rates. More generally, you use offsets because the units of observation are different in some dimension (different populations, different geographic sizes) and the outcome is proportional to that dimension. – dimitriy Oct 04 '13 at 02:36
  • What happens if $t_x=1 \ \forall x$ but the counts, i.e., steps from one day to another, can be more than one? – Druss2k Oct 25 '13 at 07:01
  • Are we always setting the continuous variable as the offset? Or can it be anything? – TYZ Jun 30 '14 at 19:48
  • 1
    @ocram. I think yours is a nice answer and I was wondering, do you (or does anybody else) know a literature reference where the issue is explained as it is here? Thank you in advance – jmjr Dec 14 '15 at 11:26
  • 1
    @ocram What do you mean by $x$ and $t_x$? What is the response variable for each $x_i$? – Metariat Jul 18 '16 at 16:44
  • I would just mention that a now-deleted answer refers to Faraway, *Extending the Linear Model with R*, 2nd ed., p. 70 (an example on rates of chromosomal abnormalities, using the `dicentric` data set from the `faraway` package). – Ben Bolker Jul 10 '20 at 20:34
  • So you can still use a Poisson loss function in this case as long as you add the "exposure" to the prediction? – thecity2 Jan 04 '22 at 19:32