1

I am modelling the average claim amount (claim amount/number of claims), so i used a gamma glm with weights=number of claims, as:

> model<-glm(claimamount/numclaims ~ 1 + veh_age + zone + insured_age,
> family=Gamma(link = "log"), weights = numclaims)

After this i want to find the log likelihood at each observation, but glm only return the sum, for instances:

> logLik(model) 
>'log Lik.' -70665.26

How can i get the loglikelihood for each observation?

I tried to do:

> x<-claimamount/numclaims 
> m<-model.matrix(model)       # covariates
> mu<-as.vector(exp(m%*%coef(model)))    #mean
> delta<-summary(model)$dispersion     # dispersion parameter
> ll<-log(dgamma(x,1/delta,1/(delta*mu)))

But when i sum all the loglikelihoods(to check if it is ok), i don't get the same value as in logLik(model).

My question is: what am i doing wrong? is it because i used weights in glm and i also have to use the weights in dgamma?

kjetil b halvorsen
  • 63,378
  • 26
  • 142
  • 467
maria
  • 11
  • 3
  • You will need to show us the details of what you're doing if you would like us to identify what is going wrong. If the call to `dgamma` literally is what you have computed, then evidently you need to take its logarithm using the `log=TRUE` argument. – whuber Jul 18 '17 at 12:50
  • @whuber i hope that it is more understandable now – maria Jul 18 '17 at 18:58
  • It is, thank you. Are you sure you have the arguments to `dgamma` correct? – whuber Jul 18 '17 at 20:37
  • @whuber it is correct when there is no weights, according to the glm/gamma parameterization... but i am not sure if it is the same when we have weights, it is one of my questions – maria Jul 19 '17 at 00:31

1 Answers1

1

The problem, as you noticed yourself, is that you use weights for your model, while you are not using them in your by-hand calculations. If you check the source of logLik.glm, you'll see that it uses AIC to calculate back the log likelihood:

p <- object$rank
## allow for estimated dispersion
if(fam %in% c("gaussian", "Gamma", "inverse.gaussian")) p <- p + 1
val <- p - object$aic / 2

Now compare the Gamma source to check how AIC is calculated:

dev.resids <- function(y, mu, wt)
    -2 * wt * (log(ifelse(y == 0, 1, y/mu)) - (y - mu)/mu)
aic <- function(y, n, mu, wt, dev){
n <- sum(wt)
disp <- dev/n
-2*sum(dgamma(y, 1/disp, scale=mu*disp, log=TRUE)*wt) + 2

it uses weighted deviance to calculate it. In fact, for weighted data we use weighted likelihood (see example here)

Tim
  • 108,699
  • 20
  • 212
  • 390