I am trying to construct GLMs to explore the relationship between woodpecker abundance and 9 predictor variables (tree density, canopy cover, and so on). My data were collected as counts of woodpeckers, but I averaged the counts from multiple visits to each site (each site had between 1 and 6 visits). So my data are not whole integers: they contain decimals, and I have a fair number of zeros (when woodpeckers were not observed).

I am hitting problems with the Poisson distribution because of the decimals. I tried adding an offset term, but I still get warnings in R because of the decimals:

>In dpois(y, mu, log = TRUE) : non-integer x = 0.917000...

The Gamma distribution won't work because I have zeros. Is the negative binomial the best bet? Maybe I am misunderstanding the negative binomial, but I thought it was best for presence-absence data, and not continuous count data.

Here is the script I tried for the offset. I may be doing this wrong:

model <- glm(woodpeckercount ~ treedensity+canopy+snags+offset(log(visits)), data=woodpeckers, family=poisson)

visits is the number of visits I made to each survey site (ranges from 1 to 6). It is the number I divided the total woodpecker count by to get the average. When using an offset, should I use the total woodpecker count for each site as my response variable instead of the average? In other words, should I not divide the woodpecker count by the number of visits, since the offset accounts for this? I want to be sure I am accounting for unequal sampling effort between sites.
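
For reference, a minimal sketch of the total-count variant asked about here, run on simulated data rather than the real survey. The data frame contents and the column name `totalcount` are assumptions made for illustration. If only the averages were kept, the totals could presumably be recovered as `round(woodpeckercount * visits)`.

```r
# Simulated stand-in for the real survey data (values are made up).
# `totalcount` is a hypothetical column name for the summed counts.
set.seed(1)
n <- 60
woodpeckers <- data.frame(
  treedensity = runif(n, 0, 10),
  canopy      = runif(n, 0, 100),
  snags       = rpois(n, 3),
  visits      = sample(1:6, n, replace = TRUE)
)

# Total birds counted over all visits to a site: an integer, so the
# Poisson family no longer sees non-integer responses.
woodpeckers$totalcount <- rpois(
  n, lambda = woodpeckers$visits * exp(-1 + 0.1 * woodpeckers$treedensity)
)

# The response is the integer total; offset(log(visits)) makes the
# linear predictor describe the expected count per visit.
model <- glm(
  totalcount ~ treedensity + canopy + snags + offset(log(visits)),
  data = woodpeckers, family = poisson
)
summary(model)
```

The same formula should carry over to a negative binomial fit (for example `MASS::glm.nb`) if the counts turn out to be overdispersed.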

Silverfish
adam baz
  • You should use the offset approach with the "total count" variable as the response. Due to the log link in Poisson or negbin regression, this amounts to: `log(E(total)) = log(visits) + x'b`. If you bring the offset `log(visits)` to the left-hand side of the equation, you can pull it inside the log and the expectation: `log(E(total/visits)) = x'b`. Thus, this models the response you want. For details see: https://stats.stackexchange.com/questions/11182/when-to-use-an-offset-in-a-poisson-regression – Achim Zeileis Mar 08 '18 at 20:27
  • @Achim Zeileis thank you for the response. That makes sense. Just to clarify, if I use the "total count" as my response variable and I use an offset (like you suggested), is there anything special I have to do to interpret my models? Or can my models and outputs be treated just like a normal GLM? – Adam Baz Mar 09 '18 at 02:36
  • Please register and/or merge your accounts (you can find information on how to do this in the **My Account** section of our help center), then you will be able to edit and comment on your own question. – gung - Reinstate Monica Mar 09 '18 at 02:37
  • I'm not sure what a "normal GLM" entails here. But you clearly have a log-linear model for the expected rate `total/visits`. Thus a one-unit change in the regressor leads to a relative change in the rate. – Achim Zeileis Mar 09 '18 at 03:00
  • One other question about using offsets: when I am trying to create a null (intercept-only) model, what would the formula be? Would it be : null1 – adam baz Mar 14 '18 at 05:09
  • @AchimZeileis I was just wondering if I can interpret my final model (after performing variable reduction on the full model) as I would if I were building a GLM without an offset. In other words, can the variables present in my final reduced model (and their coefficients) be interpreted as usual, or do I need to account for the offset somehow? – adam baz Mar 14 '18 at 06:01
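
The points raised in these comments can be illustrated with a small sketch on simulated data. The object names `d`, `full_model`, and `null_model` are hypothetical, and the single predictor is only a stand-in for the real covariates. The offset stays in the intercept-only model, and exponentiated coefficients read as multiplicative changes in the per-visit rate.

```r
# Simulated data; all names and values are illustrative assumptions.
set.seed(2)
n <- 60
visits      <- sample(1:6, n, replace = TRUE)
treedensity <- runif(n, 0, 10)
totalcount  <- rpois(n, visits * exp(-1 + 0.15 * treedensity))
d <- data.frame(totalcount, treedensity, visits)

# Model with a predictor, and the intercept-only (null) model.
# Both keep offset(log(visits)) so they describe the same per-visit rate.
full_model <- glm(totalcount ~ treedensity + offset(log(visits)),
                  data = d, family = poisson)
null_model <- glm(totalcount ~ 1 + offset(log(visits)),
                  data = d, family = poisson)

# exp(coefficient) is the factor by which the expected count per visit
# changes for a one-unit increase in the predictor.
exp(coef(full_model))

# In the null model, exp(intercept) is the pooled per-visit rate,
# i.e. total birds divided by total visits.
exp(coef(null_model))
sum(d$totalcount) / sum(d$visits)

# Likelihood-ratio comparison of the two nested models.
anova(null_model, full_model, test = "Chisq")
```

Apart from reading effects on the per-visit rate scale, the coefficients are interpreted as in any other log-link GLM; the offset has no coefficient of its own.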
