
I have WinBUGS code for a zero-inflated Poisson (ZIP) model. I obtained this code from my lab at university, and the person who wrote it is no longer available for me to ask questions. Here is the code:

model
{
  for (i in 1:N)
  {
    z[i] <- 0                 # zeros trick: a Poisson "observation" fixed at 0
    z[i] ~ dpois(phi[i])
    phi[i] <- -ll[i]
    p[i] ~ dbeta(1, 1)
    ll[i] <- zero[i]*(log(p[i]) - mu[i] + log(1-p[i])) +
             (1-zero[i])*(log(1-p[i]) - mu[i] + y[i]*log(mu[i]) - logfact(y[i]))  # ll is log-likelihood
    zero[i] <- equals(y[i], 0)
    log(mu[i]) <- log(e[i] + 0.0000001) + intercept + v[i]
    v[i] ~ dnorm(0, sig.v)
  }

  # Other priors:
  intercept ~ dflat()
  sig.v ~ dgamma(0.1, 0.1)
}

I have 2 main questions:

1) In much of the WinBUGS ZIP code I come across, such as here, I typically see dbern being used. In the code I have, however, dbeta is used instead. Any clarification on this would be appreciated; and

2) In the code, the variable ll is the log-likelihood. I am looking for a reference (book, journal article, etc.) giving a formal definition of this equation, or an explanation of what each term in ll represents.

Any help is appreciated. I am also open to alternate formulations that may be equivalent to the code I posted.

user121

1 Answer


The likelihood function of a zero-inflated Poisson variate $y_i$ can be written as:

$$\mathcal{L}(\mu, p) = [p+(1-p)e^{-\mu}]^{1(y_i=0)}[(1-p)e^{-\mu}\mu^{y_i}/y_i!]^{1(y_i>0)}$$

where $p$ is the probability of $y_i=0$ due to the zero-inflation process, $1(y_i=0)$ is the indicator function that equals $1$ if the condition is true, $0$ otherwise, and $\mu$ is the mean of the Poisson variate. The first bracketed term on the left represents the two possible ways $y_i$ can equal $0$: first, if the zero inflation process takes over, which it does with probability $p$, and second, if the zero inflation process does not take over but the Poisson process generates a $0$ anyway, which happens with probability $(1-p)e^{-\mu}$. The second bracketed term represents the probability of $y_i|y_i > 0$, which can only occur if the zero-inflation process has not taken over.

Using the exponentiated $1(y_i=0)$ formulation allows us to avoid the awkward "if $y_i=0$ then ..." way of writing out the likelihood function, which leads to much messier (notationally speaking) expressions.

Note that $1(y_i=0)$ corresponds to the variable zero[i] in the code you've provided; you can see that from the line zero[i] <- equals(y[i], 0), and that $1(y_i>0)$ = 1 - zero[i] as well.

Taking the log gives us:

\begin{align} l(\mu, p) = \,&1(y_i=0)\log(p+(1-p)e^{-\mu}) +\\&1(y_i>0) [\log(1-p)-\mu + y_i\log(\mu) - \log(y_i!)] \end{align}

Translating this directly into code gives us:

ll[i] <- zero[i]*log(p[i] + (1-p[i])*exp(-mu[i])) +
         (1-zero[i])*(log(1-p[i]) - mu[i] + y[i]*log(mu[i]) - logfact(y[i]))

where the only change required is subscripting the likelihood, $\mu$ and $p$ parameters.

This does not match your code snippet, which has zero[i]*(log(p[i]) - mu[i] + log(1-p[i])) instead of zero[i]*log(p[i] + (1-p[i])*exp(-mu[i])). It looks as though the original programmer mistakenly calculated the log of $p + (1-p)e^{-\mu}$ as the sum of the logs of its two additive terms, $\log(p) - \mu + \log(1-p)$, instead of $\log(p + (1-p)e^{-\mu})$.

We can test the correctness of these two versions by implementing them, in this case in R, choosing parameters $\mu$ and $p$, then summing the exponentiated log likelihoods over $y=0$ to some number much larger than $\mu$ (large enough so that the total probability should be very close to one):

# Provided code: returns the per-observation log-likelihood
foo <- function(y, p, mu) {
   zero <- y == 0
   zero * (log(p) - mu + log(1-p)) +
      (1-zero) * (log(1-p) - mu + y*log(mu) - lfactorial(y))
}

# Derived code: same, with the corrected zero term
bar <- function(y, p, mu) {
   zero <- y == 0
   zero * log(p + (1-p)*exp(-mu)) +
      (1-zero) * (log(1-p) - mu + y*log(mu) - lfactorial(y))
}

p <- 0.5
mu <- 0.2
y <- 0:10

sum(exp(foo(y, p, mu)))
# [1] 0.2953173
sum(exp(bar(y, p, mu)))
# [1] 1

... which would seem to confirm that the original code snippet is incorrect.

Using this way of formulating the likelihood, it is natural to put the prior distributions directly on $\mu$ and $p$, hence the use of p[i]~dbeta(1,1) (the uniform distribution on $(0,1)$). An alternative formulation creates "hidden variables" (the x[i] in the linked code) that take on the values 0 or 1 depending upon whether the zero-inflation process is "activated" or not for the $i^{th}$ observation. In this alternative formulation there are no indicator variables, therefore no zero[i] term or equivalent. Instead, the $x_i$ are treated in the same way as parameters (they are unobserved, after all) and estimated in the model in the same way that $\mu$ and $p$ are, hence the use of x[i]~dbern(pro[i]) in the linked code snippet. The end result of the estimation procedure should be the same, it's merely a matter of choice of algorithmic approach.
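To make the contrast with your code concrete, here is a minimal sketch of that latent-variable formulation in BUGS syntax, keeping the offset and random-effect structure from your model. Treat this as an illustration rather than the linked author's exact code: the node names x[i] and q are my own choices (the linked snippet uses x[i] ~ dbern(pro[i])), and the small constant added to lambda[i] is a common device to keep the Poisson mean strictly positive when x[i] = 0:

model
{
  for (i in 1:N)
  {
    y[i] ~ dpois(lambda[i])
    lambda[i] <- x[i]*mu[i] + 0.0000001   # mean collapses to ~0 when inflated
    x[i] ~ dbern(q)                        # x[i] = 1: Poisson process is active
    log(mu[i]) <- log(e[i] + 0.0000001) + intercept + v[i]
    v[i] ~ dnorm(0, sig.v)
  }

  q <- 1 - p
  p ~ dbeta(1, 1)          # probability of a structural zero
  intercept ~ dflat()
  sig.v ~ dgamma(0.1, 0.1)
}

Note that no zeros trick is needed here, because y[i] is given an ordinary sampling distribution; p is also a single parameter rather than one per observation, and the posterior for $(p, \mu)$ should agree with the indicator-based version up to Monte Carlo error.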

jbowman