I am not entirely sure whether this would be better posted on Stack Overflow or maths.se; however, as it is about model fitting, I thought I would try here first.
The Skellam distribution $\operatorname{Skel}(\mu_1, \mu_2)$ is the distribution of the difference between two independent Poisson random variables with mean parameters $\mu_1$ and $\mu_2$ respectively. The R package skellam has a regression function skellam.reg which returns maximum-likelihood estimates of the parameters $\beta_{ij}$, where $i = 1, 2$, in the following regression equations:
$$\ln(\mu_1) = \beta_{10} + \beta_{11} x_1 + \beta_{12} x_2 + \dots$$
$$\ln(\mu_2) = \beta_{20} + \beta_{21} x_1 + \beta_{22} x_2 + \dots$$
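To make the setup concrete, here is a small simulation of the model as I understand it (the covariate names and coefficient values are made up purely for illustration):

library(skellam)
set.seed(1)
n <- 500
x1 <- rnorm(n)
x2 <- rnorm(n)
mu1 <- exp(0.5 + 0.3 * x1 - 0.2 * x2)  # ln(mu1) is linear in the covariates
mu2 <- exp(0.1 - 0.4 * x1 + 0.2 * x2)  # ln(mu2) is linear in the covariates
y <- rpois(n, mu1) - rpois(n, mu2)     # a Skellam(mu1, mu2) response by construction
skellam.reg(y, cbind(x1, x2))          # fit the Skellam regression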
In the code there are three lines that find the maximum-likelihood estimates:
mod <- stats::nlm(skelreg, stats::rnorm(2 * p), iterlim = 5000)
mod <- stats::nlm(skelreg, mod$estimate, iterlim = 5000)
mod <- stats::optim(mod$estimate, skelreg, hessian = TRUE,
                    control = list(maxit = 5000))
where the objective function skelreg() returns minus the Skellam log-likelihood (written out after the list below). So it looks like parameter estimation happens in three steps:
- Draw the starting parameters from a standard normal distribution and run the non-linear minimiser nlm for up to 5000 iterations.
- Feed the estimates from the previous step back into nlm for up to another 5000 iterations.
- Take the estimates from the second step and run them through R's general-purpose optim function (with hessian = TRUE) for up to 5000 more iterations.
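For reference, my reading of skelreg() is that it returns minus the Skellam log-likelihood,
$$-\ell(\beta_1, \beta_2) = \sum_{i=1}^{n} \left[ \mu_{1i} + \mu_{2i} - \frac{y_i}{2}\left(\ln \mu_{1i} - \ln \mu_{2i}\right) - \ln I_{y_i}\!\left(2\sqrt{\mu_{1i}\mu_{2i}}\right) \right]$$
where $I_{y}(\cdot)$ is the modified Bessel function of the first kind; minimising this maximises the likelihood, which is why the function returns loglik = -mod$value at the end.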
I don't understand why the optimisation is broken down into these three steps when it seems a single call to nlm with hessian = TRUE and iterlim = 15000 ought to do the same job.
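In other words, I would have expected something along these lines to be enough (just a sketch, not tested):

mod <- stats::nlm(skelreg, stats::rnorm(2 * p), hessian = TRUE, iterlim = 15000)
b1 <- mod$estimate[1:p]               # nlm returns the estimates in mod$estimate
b2 <- mod$estimate[-c(1:p)]
s <- sqrt(diag(solve(mod$hessian)))   # standard errors from the returned Hessian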
The full code is posted below for reference (in-line comments are mine):
skellam.reg <- function (y, x)
{
    n <- length(y)
    x <- stats::model.matrix(~., data.frame(x))
    p <- dim(x)[2]
    skelreg <- function(pa) { # minus the Skellam log-likelihood (the quantity being minimised)
        b1 <- pa[1:p]
        b2 <- pa[-c(1:p)]
        a1 <- x %*% b1
        a2 <- x %*% b2
        lam1 <- exp(a1) # log link: the linear predictor a1 lives on the log scale
        lam2 <- exp(a2) # log link: the linear predictor a2 lives on the log scale
        a <- 2 * sqrt(lam1 * lam2)
        sum(lam1 + lam2) - 0.5 * sum(y * (a1 - a2)) - sum(log(besselI(a, y)))
    }
    options(warn = -1)
    mod <- stats::nlm(skelreg, stats::rnorm(2 * p), iterlim = 5000)
    mod <- stats::nlm(skelreg, mod$estimate, iterlim = 5000)
    mod <- stats::optim(mod$estimate, skelreg, hessian = TRUE,
                        control = list(maxit = 5000))
    b1 <- mod$par[1:p]
    b2 <- mod$par[-c(1:p)]
    s <- diag(solve(mod$hessian))
    s1 <- sqrt(s[1:p])
    s2 <- sqrt(s[-c(1:p)])
    param1 <- cbind(b1, s1, b1/s1, stats::pchisq((b1/s1)^2, 1, lower.tail = FALSE))
    param2 <- cbind(b2, s2, b2/s2, stats::pchisq((b2/s2)^2, 1, lower.tail = FALSE))
    rownames(param1) <- rownames(param2) <- colnames(x)
    colnames(param1) <- colnames(param2) <- c("Estimate", "Std. Error",
                                              "Wald value", "p-value")
    list(loglik = -mod$value, param1 = param1, param2 = param2)
}