I am confused about whether, and how, you can make population averaged predictions from a fitted GAM. Any advice or pointers to good worked examples would be much appreciated.
I am using GAMs to model growth through time across 200 animals, example data are here.
I read in my example data
test <- read.csv("test.csv", header = TRUE)
test$tagged <- factor(test$tagged)
test$sex_t0 <- factor(test$sex_t0)
test$scale_id <- factor(test$scale_id)
and run my model
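library(mgcv)

gam1 <- gam(weight_t ~
              tagged +
              sex_t0 +
              s(age.x, k = 6) +                # population-level growth curve
              s(scale_id, bs = "re") +         # random intercept per animal
              s(age.x, scale_id, bs = "re"),   # random slope of age per animal
            data = test,                       # the data frame read in above
            method = "REML",
            family = Gamma(link = "log"))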
I then create a new data frame to predict from
pred.dat <- data.frame(tagged   = c(rep(0, 752), rep(1, 752)),
                       sex_t0   = c(rep("f", 376), rep("m", 376), rep("f", 376), rep("m", 376)),
                       age.x    = c(rep(seq(9, 384, 1), 4)),
                       scale_id = rep(1, 1504))
pred.dat$tagged <- factor(pred.dat$tagged)
pred.dat$sex_t0 <- factor(pred.dat$sex_t0)
pred.dat$scale_id <- factor(pred.dat$scale_id)
When predicting from my fitted GAM I can use the exclude argument, which I understand sets my random effects to 0 and so leaves them out of the predictions (see here). The plots I produce also suggest this: the confidence intervals widen greatly through time, which suggests that the random intercept and slope in my model have not been accounted for when predicting. Some increase in variation with age would be expected, but not as much as is shown if this were a population averaged prediction.
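# Term labels must match those shown by summary(gam1) exactly;
# mgcv writes the slope term without a space: "s(age.x,scale_id)"
preds <- predict(gam1,
                 newdata = pred.dat,
                 exclude = c("s(scale_id)",
                             "s(age.x,scale_id)"),
                 se.fit = TRUE, type = "response")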
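As a check on what exclude does, I also tried predicting on the link scale and back-transforming the interval limits. This is only a minimal sketch on simulated data, not my real data set: the names dat, id, age, y, m and nd, and all simulated values, are made up for illustration, and the 1.96 multiplier assumes an approximate 95% Wald interval.

```r
library(mgcv)

set.seed(1)

# Simulated stand-in data (hypothetical): 20 animals, 10 ages each,
# Gamma response with a log link and a random intercept per animal
n_id <- 20
dat <- expand.grid(id = factor(seq_len(n_id)), age = seq(10, 100, by = 10))
b0 <- rnorm(n_id, sd = 0.2)
mu <- exp(1 + 0.01 * dat$age + b0[as.integer(dat$id)])
dat$y <- rgamma(nrow(dat), shape = 20, rate = 20 / mu)

m <- gam(y ~ s(age, k = 5) + s(id, bs = "re"),
         data = dat, method = "REML", family = Gamma(link = "log"))

# Smooth labels exactly as mgcv stores them; exclude must match these
labs <- vapply(m$smooth, function(s) s$label, character(1))

# newdata still needs an id column, even though the term is excluded
nd <- data.frame(age = seq(10, 100, by = 10), id = dat$id[1])

# Predict on the link scale with the random effect set to zero
p <- predict(m, newdata = nd, exclude = "s(id)", se.fit = TRUE, type = "link")

# Back-transform the fit and the interval limits with the inverse link
ilink <- family(m)$linkinv
fit   <- ilink(p$fit)
lower <- ilink(p$fit - 1.96 * p$se.fit)
upper <- ilink(p$fit + 1.96 * p$se.fit)
```

Building the interval on the link scale and then back-transforming keeps the limits positive, which is not guaranteed when adding and subtracting standard errors from type = "response" predictions.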
From the help page for predict.gam, I understood (perhaps incorrectly) that type = "iterms" can be used to produce population averaged predictions from a fitted GAM. However, with this option I no longer get a single predicted value and standard error per row; instead I get one column per model term.
preds <- predict(gam1,
                 newdata = pred.dat,
                 se.fit = TRUE, type = "iterms")
Any advice on how to produce population averaged predictions from a fitted GAM would be appreciated. I have read a number of pages but remain confused (here, here, here, here, and others).
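To pin down what I mean by "population averaged", here is how I currently understand the two kinds of prediction. This is a sketch under assumptions, not something I have confirmed: the symbols $\eta(a)$ (the tagged, sex and smooth age terms evaluated at age $a$), $b_{0i}$, $b_{1i}$, $\sigma_0^2$ and $\sigma_1^2$ are my own notation, and I assume the two bs = "re" terms act as independent Gaussian random effects.

```latex
\begin{align}
\log \mu_i(a) &= \eta(a) + b_{0i} + b_{1i}\,a,
  \qquad b_{0i} \sim N(0, \sigma_0^2), \quad b_{1i} \sim N(0, \sigma_1^2) \\
\text{random effects set to zero:}\quad
  \mu(a \mid b_{0i} = b_{1i} = 0) &= \exp\{\eta(a)\} \\
\text{averaged over animals:}\quad
  \mathrm{E}_b\!\left[\mu_i(a)\right] &= \exp\!\left\{\eta(a) + \tfrac{1}{2}\left(\sigma_0^2 + a^2 \sigma_1^2\right)\right\}
\end{align}
```

If this is right, then with a log link the prediction with the random effects excluded is the expected weight of a "typical" animal (second line), which is smaller than the average over animals (third line) whenever the random effect variances are non-zero. It is the third quantity that I am trying to obtain.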