I am trying to model weekly disease counts in 25 regions within one country over a ten-year period as a function of temperature. The data are zero-inflated and overdispersed.
I am most familiar with Stata, but I don't think that there is any option amongst the gee, xtmixed, xtmepoisson, etc. commands that allows me to account for the zero inflation and overdispersion as well as the autocorrelation.
I log-transformed the incidence data and fitted a SARIMA model, but the residuals are not quite normal. I think that there are versions of the ARIMA model for integer data such as disease counts, but I can't find a program that implements them.
I was also thinking that I could fit a hierarchical model with a random intercept for each region and a random effect of temperature in each region, while also accounting for the regular seasonal disease cycle. I believe that I could fit this in R using a package like glmm.admb, but due to my limited statistical and R knowledge I am not entirely sure how to use it. I am mainly confused about how to account for the autocorrelation and the seasonal cycle in the data with a program like this.
Any advice on how to best do this?