
I have recently come across a number of articles that use Poisson GLMs to model proportion data. For instance, one study modelled the proportion of pig mortality for each sow using a Poisson GLM in SAS.

My experience with proportion data has been to model it using a beta regression or a binomial-logistic regression (beta-binomial?), which take into account the bounds on the distribution. By contrast, my understanding of Poisson models is that they are only suitable for count data, since the distribution is discrete.

Could someone explain to me a) whether this is common practice, and b) what the implications are of modelling proportion data using Poisson models?

kjetil b halvorsen
user3237820
  • You probably misunderstood this SAS code. The data necessarily are integer numbers if one fits a Poisson distribution. Maybe you should include this SAS code in your post. – Stéphane Laurent Jan 08 '16 at 09:52
  • I have a vague recollection of some SAS code including something like `2/100`. That does not mean $0.02$; it means something like "2 among 100". Maybe you saw something like this? – Stéphane Laurent Jan 08 '16 at 09:55
  • @StéphaneLaurent: Could be that, but see also [How is it possible that Poisson GLM accepts non-integer numbers?](http://stats.stackexchange.com/q/70054/17230). For small proportions over a small range the variance might well be roughly proportional to the mean. A little more detail in the question would help. – Scortchi - Reinstate Monica Jan 08 '16 at 09:57
  • Were they perhaps modelling deaths per time units that pigs were observed for? In that case using a Poisson likelihood for the sufficient statistics of deaths & total follow-up time gives you an exponential time-to-event model. – Björn Jan 08 '16 at 11:58
  • @Björn: Good idea. More generally a count per *something* can be modelled with *something* in an offset term (see [When to use an offset in a Poisson regression?](http://stats.stackexchange.com/q/11182/17230)), & when the *something* is also a count we might be liable to take the count per *something* for a proportion even though it's not. – Scortchi - Reinstate Monica Jan 08 '16 at 13:57
  • Thanks for your responses! An example article is this one: http://www.jstor.org/stable/41414575?seq=4#page_scan_tab_contents. I do not know SAS, and there was no code provided, so @StéphaneLaurent I do not feel that I can answer your requests. Björn, they were modelling the proportion of live born piglets at farrowing. The offset example I am familiar with. – user3237820 Jan 09 '16 at 12:41
  • I notice "sibling competition" appears in the first sentence of the abstract, so there's probably a strong motive for not considering live births as independent Bernoulli random variables. (And, by the way, isn't it the proportion of the litter surviving till weaning they modelled as a Poisson r.v.?) – Scortchi - Reinstate Monica Jan 22 '16 at 10:55

1 Answer

Yes. I believe you can use a Poisson model for rates by including the log of the denominator as an offset on the right-hand side. If you are modelling the rate $\frac{Y}{q}$ (piglets per sow), then

$$\log\left(\frac{E[Y]}{q}\right) = a + bx $$

can be written as

$$ \log(E[Y]) = a + bx + \log(q) $$

so the denominator enters as the offset $\log(q)$, with its coefficient fixed at 1 and a different value of $\log(q)$ for each $Y$, of course.
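To make the offset idea concrete, here is a minimal sketch in pure Python. It is not the analysis from the article in the question, and everything in it is an assumption for illustration: the data are made up, and the names `fit_poisson_offset`, `deaths`, `litter_size` and `treated` are hypothetical. It fits the log-linear model with offset $\log(q)$ by Newton-Raphson for a single covariate.

```python
import math

def fit_poisson_offset(y, x, q, iters=50, tol=1e-10):
    """Fit log(E[Y]) = a + b*x + log(q) by Newton-Raphson.

    y: counts, x: one covariate, q: denominators (exposure).
    Hypothetical helper for illustration, not a library function.
    """
    # Start from the intercept-only fit: overall rate sum(y) / sum(q).
    a = math.log(sum(y) / sum(q))
    b = 0.0
    for _ in range(iters):
        # Fitted means; the offset log(q) enters with coefficient 1.
        mu = [qi * math.exp(a + b * xi) for xi, qi in zip(x, q)]
        # Score vector (gradient of the Poisson log-likelihood).
        ga = sum(yi - mi for yi, mi in zip(y, mu))
        gb = sum(xi * (yi - mi) for xi, yi, mi in zip(x, y, mu))
        # 2x2 Fisher information matrix.
        haa = sum(mu)
        hab = sum(xi * mi for xi, mi in zip(x, mu))
        hbb = sum(xi * xi * mi for xi, mi in zip(x, mu))
        det = haa * hbb - hab * hab
        # Newton step: solve the 2x2 system explicitly.
        da = (hbb * ga - hab * gb) / det
        db = (haa * gb - hab * ga) / det
        a += da
        b += db
        if abs(da) + abs(db) < tol:
            break
    return a, b

# Made-up data: piglet deaths (Y) out of litter size (q) for six sows,
# with a hypothetical binary covariate x (e.g. a treatment indicator).
deaths = [1, 2, 0, 3, 2, 4]
litter_size = [10, 12, 8, 11, 9, 13]
treated = [0, 0, 0, 1, 1, 1]

a_hat, b_hat = fit_poisson_offset(deaths, treated, litter_size)
print("baseline rate exp(a):", math.exp(a_hat))
print("rate ratio exp(b):   ", math.exp(b_hat))
```

With a binary covariate the fit has a closed form that serves as a sanity check: $\exp(a)$ is total deaths over total piglets in the $x = 0$ group, and $\exp(b)$ is the ratio of the two groups' pooled rates. Note that nothing in this model stops a fitted rate from exceeding 1, which is one practical difference from a binomial model when the "rate" is really a proportion.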

Ben Bolker