We are hoping for some guidance on a GAM fitted with the mgcv R package. We want to know whether variables measured over time affect our outcome variable. In other words, does the outcome change with variable "X" and/or variable "Y" after controlling for time?

We have the following variables:

  1. ID - subject identification
  2. year - continuous covariate, repeated measures
  3. Var X - continuous variable
  4. Var Y - continuous variable
  5. Outcome - continuous variable, dependent variable

We are not entirely sure if the following model answers our question:

m1 <- gam(Outcome ~ s(VarX) + s(VarY) + s(year) +
      s(year, ID, bs = "fs"),
      family = gaussian, data = dat, method = 'REML')

Or if the model below is the one taking into account that each variable is measured over time for each ID:

m2 <- gam(Outcome ~ s(VarX) + s(VarY) + s(year) +
     s(year, ID, bs = "fs", by = VarX) +
     s(year, ID, bs = "fs", by = VarY),
     data = dat, method = 'REML')
DSan

1 Answer

Maybe something like this?

m2 <- gam(Outcome ~ s(VarX) + s(VarY) + s(year) +
     s(VarX, ID, bs = "fs") +
     s(VarY, ID, bs = "fs"),
     data = dat, method = 'REML')

This is similar to a random slope and intercept model with a statistical control for year at the global level. If you were to include the interaction between VarX/VarY and year, you would want to fit it with a tensor product smooth rather than an isotropic smooth (te() rather than s()). However, the factor smooth interaction basis (bs = "fs") won't work for te() smooths. Pedersen et al. (2019) give a general form for fitting these higher-dimensional models:

y ~ te(x1, x2, bs = "tp", m = 2) +
    t2(x1, x2, fac, bs = c("tp","tp","re"), m = 2, full = TRUE)

where x1/x2 are continuous covariates and fac is your grouping variable. The tensor product smooth allows for modeling interactions among covariates with different units, whereas the isotropic smoothers work best when modeling covariates with the same units (commonly units of distance).
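
As a rough illustration of the general form above, here is a minimal sketch on simulated data. The data frame, the sample sizes (8 subjects, 50 time points each), the simulated effects, and the basis dimensions in k are all assumptions made up for this example, not taken from the question; a real data set needs enough rows to support the number of coefficients this kind of model uses.

```r
library(mgcv)

set.seed(1)

# Hypothetical simulated data: 8 subjects, 50 time points each (assumed sizes)
n_id   <- 8
n_time <- 50
dat <- data.frame(
  ID   = factor(rep(seq_len(n_id), each = n_time)),
  year = rep(seq_len(n_time), times = n_id)
)
dat$VarX <- runif(nrow(dat))

# Assumed truth: a smooth effect of VarX, a linear trend in year,
# and a subject-specific shift
subj_shift <- rnorm(n_id, sd = 0.5)
dat$Outcome <- sin(2 * pi * dat$VarX) + 0.05 * dat$year +
  subj_shift[as.integer(dat$ID)] + rnorm(nrow(dat), sd = 0.3)

# Global VarX-by-year surface plus subject-level deviations from it,
# following the te() + t2(..., "re") form quoted above
m_te <- gam(
  Outcome ~ te(VarX, year, bs = "tp", k = c(5, 5), m = 2) +
    t2(VarX, year, ID, bs = c("tp", "tp", "re"),
       k = c(5, 5, n_id), m = 2, full = TRUE),
  data = dat, method = "REML"
)

summary(m_te)
```

The t2() term here has roughly 5 x 5 x 8 coefficients, so it is only fitted comfortably because the simulated data have 400 rows; with few repeated measures per subject the basis dimensions would have to be reduced.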

References:

Pedersen, Eric J., et al. "Hierarchical generalized additive models in ecology: an introduction with mgcv." PeerJ 7 (2019): e6876.

  • Thank you! It worked. I guess if we are looking at the time interaction we should use ti(). – DSan Aug 09 '21 at 15:19
  • Those `"fs"` smooths aren't defined correctly; you want `s(VarX, ID, bs = 'fs')`, i.e. the factor is supplied after the continuous covariate. What you are doing is essentially equivalent to `ID + s(VarX, by = ID)`. If you want separate smoothness parameters over the levels of `ID`, then drop the `"fs"` bit and just use `ID + s(VarX, by = ID)`. But you can't mix these with the data the OP has as there is but one level in the hierarchy. More generally one can have `f1 + s(x1, f2, bs = "fs", by = f1)` where f1 and f2 are factors, but that isn't the case here. – Gavin Simpson Aug 13 '21 at 07:32
  • Thanks for the correction, @GavinSimpson. So to be clear, the correct form is `s(VarX, ID, bs = 'fs')`, which is not equivalent to `s(X, by = ID, bs = 'fs')`? Is it uncouth to correct my answer to reflect the correction, or should I just leave it as is? – Sean Hardison Aug 13 '21 at 22:45
  • Yes; that is the correct form. Yes; those two formulations aren't equivalent. No; it is not uncouth to edit one's answers here and on other Stack Exchange websites. In fact it is an expected part of the interface and gamification behind the rep system. So, please do edit your answer to reflect the comments. – Gavin Simpson Aug 15 '21 at 06:36
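
The two specifications contrasted in the comments above can be compared side by side. This is only a sketch on simulated data; the data frame, sample sizes, and simulated effects are assumptions for illustration. It fits the factor-smooth form `s(VarX, ID, bs = "fs")` and the by-factor form `ID + s(VarX, by = ID)`, then inspects the estimated smoothing parameters of each.

```r
library(mgcv)

set.seed(2)

# Hypothetical simulated data: 6 subjects, 40 observations each (assumed sizes)
n_id  <- 6
n_obs <- 40
dat <- data.frame(
  ID   = factor(rep(seq_len(n_id), each = n_obs)),
  VarX = runif(n_id * n_obs)
)
subj_shift <- rnorm(n_id, sd = 0.5)
dat$Outcome <- sin(2 * pi * dat$VarX) +
  subj_shift[as.integer(dat$ID)] + rnorm(nrow(dat), sd = 0.3)

# Factor-smooth interaction: one curve per ID, with smoothing
# parameters shared across the levels of ID
m_fs <- gam(Outcome ~ s(VarX, ID, bs = "fs"),
            data = dat, method = "REML")

# By-factor smooths: one curve per ID, each with its own smoothing
# parameter, plus a parametric ID term for the level-specific means
m_by <- gam(Outcome ~ ID + s(VarX, by = ID),
            data = dat, method = "REML")

m_fs$sp          # smoothing parameters of the "fs" term
m_by$sp          # one smoothing parameter per level of ID
AIC(m_fs, m_by)  # rough comparison of the two fits
```

In the by-factor fit each level of ID gets its own smooth and its own smoothing parameter, which is the "separate smoothness parameters" case described in the comment.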