
I am aware that there is a duality between random effects and smooth curve estimation. At this link, Simon Wood describes how to specify random effects using mgcv. Of particular note is the following passage:

For example if g is a factor then s(g,bs="re") produces a random coefficient for each level of g, with the random coefficients all modelled as i.i.d. normal.

After a quick simulation, I can see this is correct, and that the model fits are almost identical. However, the likelihoods and degrees of freedom are VERY different. Can anyone explain, statistically, the difference? Which one should be used for testing?

library(mgcv)
library(lme4)
set.seed(1)
x <- rnorm(1000) 
ID <- rep(1:200,each=5)
y <- x 
for(i in 1:200) y[which(ID==i)] <- y[which(ID==i)] + rnorm(1)
y <- y + rnorm(1000)
ID <- as.factor(ID)

# gam (mgcv)
m <- gam(y ~ x + s(ID,bs="re"))
gam.vcomp(m)
coef(m)[1:2]
logLik(m)

# lmer
m2 <- lmer(y ~ x + (1|ID))
sqrt(VarCorr(m2)$ID[1])
summary(m2)$coef[,1]
logLik(m2)
linksys
  • Note to mods: I posted this on stackoverflow, which is probably a more appropriate venue. I tried to delete this post here but cannot figure out how. – linksys Feb 22 '16 at 17:10
  • I actually think this question is more appropriate here than at stack overflow. – Jake Westfall Feb 22 '16 at 17:17
  • Does this answer your question? http://stats.stackexchange.com/a/97522/1390 This is basically what I mentioned in the comments to your other [question](http://stats.stackexchange.com/q/197952/1390) – Gavin Simpson Feb 22 '16 at 20:14
  • I understand this dynamic and that it is a possible explanation, but is there any way to verify that? What you said about the degrees of freedom was very helpful – linksys Feb 22 '16 at 20:22
  • @linksys not without digging into the algorithms and approaches used by the respective software. This is implementation detail, which may not be a good fit for either site; you're basically going to have to dig down into the derivations of the likelihood functions used by the respective packages. I suspect the difference in the likelihoods is due to how the model is initially specified; as a mixed model or as a spline basis. I think I've answered the DF problem in the comments to the other post. I might add that here as a partial answer... – Gavin Simpson Feb 23 '16 at 15:49
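
One way to probe the explanation suggested in the comments is to put both fits on the same estimation criterion and then compare what each `logLik()` reports. The sketch below is only illustrative and rests on a few assumptions. By default, `gam()` selects smoothing parameters with `method = "GCV.Cp"` while `lmer()` uses REML, so both are set to REML here. As I read the mgcv documentation, `logLik()` for a `gam` is evaluated at the penalized coefficient estimates and counts effective degrees of freedom, whereas the `gcv.ubre` component of the fitted object holds the minimized (negative) restricted log-likelihood when `method = "REML"`. The names `m_gam`, `m_lmer`, `ll_gam` and `ll_lmer` are made up for this example, and the data follow the same simulation design as in the question.

```r
library(mgcv)
library(lme4)

# Same simulation design as in the question: 200 groups of 5 observations,
# a random intercept per group, and unit-variance residual noise.
set.seed(1)
x  <- rnorm(1000)
ID <- factor(rep(1:200, each = 5))
b  <- rnorm(200)                      # one random intercept per level of ID
y  <- x + b[as.integer(ID)] + rnorm(1000)

# Fit both models with REML so the estimation criterion matches.
m_gam  <- gam(y ~ x + s(ID, bs = "re"), method = "REML")
m_lmer <- lmer(y ~ x + (1 | ID), REML = TRUE)

# What each package reports through logLik(), including its "df" attribute.
ll_gam  <- logLik(m_gam)
ll_lmer <- logLik(m_lmer)
print(ll_gam)
print(ll_lmer)

# Degrees of freedom: effective df of the penalized fit (gam) versus
# the count of fixed effects plus variance parameters (lmer).
print(attr(ll_gam, "df"))
print(attr(ll_lmer, "df"))

# The criterion mgcv actually optimised (sign flipped), next to
# lmer's restricted log-likelihood, for comparison.
print(-m_gam$gcv.ubre)
print(as.numeric(ll_lmer))
```

If the explanation holds, the two `logLik()` values should still differ, because they measure different things, while the last two printed numbers are the ones that are meaningfully comparable. Whichever criterion is used for testing, it seems safest to take both log-likelihoods from the same package rather than mixing them.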
