I am modelling time series of organism traits (7 different traits in total) using GLM(M)s in R. The data were collected at very irregular intervals from 6 different locations. At every location 5-10 animals were sampled; some locations were sampled multiple times, whereas others were sampled just once or twice. I am interested in whether Trait changes significantly with Year.

I decided to use Location as a random effect (a random intercept), so that the model is of the form:

glmer(Trait~ Year + Var1 + (1|Location), 
               family=gaussian(link = "log"),
               data = data)

The model diagnostics look good and don't give me any reason for concern.

Just for the sake of it, I also modelled the same traits using multiple linear regression, this time with Location as a fixed term. This required log-transforming the traits:

lm(log(Trait) ~ Year + Var1 + Location, data = data)

The diagnostics for these models also look good. Both models suggest the same trend in the data.

The problem I am now facing is deciding which model to use. The literature commonly says that the simplest possible statistical tool should be favoured over more complicated ones, which suggests to me that I should favour the linear regression. On the other hand, one is also discouraged from transforming data to fit the model.

Robert Long

1 Answer

Six groups is generally considered to be at or very near the lower limit for fitting random intercepts, because the between-location variance then has to be estimated from only six values. In your case I would suggest the linear model with fixed effects for Location. If this is for publication or presentation, I would also mention that you fitted a mixed model with random intercepts for Location and found that the inferences were very similar.
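As a rough sketch of that side-by-side check, the snippet below fits both versions to simulated data and puts the Year estimates next to each other. Everything about the data is an assumption made for illustration (the `sim` data frame, the centred `YearC` variable, the effect sizes, and the balanced design, which your real data do not have); only the model formulas follow the ones in the question. It needs the lme4 package.

```r
library(lme4)

set.seed(1)
# Simulated stand-in for the real data (illustrative assumption):
# 6 locations, 5 sampling years, 8 animals per location and year
sim <- expand.grid(Animal   = 1:8,
                   Year     = c(2001, 2004, 2005, 2010, 2016),
                   Location = factor(paste0("L", 1:6)))
sim$YearC <- sim$Year - 2000          # centre Year to keep the intercept sensible
sim$Var1  <- rnorm(nrow(sim))
loc_eff   <- rnorm(6, sd = 0.3)       # one intercept shift per location
sim$Trait <- exp(1 + 0.02 * sim$YearC + 0.1 * sim$Var1 +
                 loc_eff[as.integer(sim$Location)] +
                 rnorm(nrow(sim), sd = 0.2))

# Location as a fixed effect
fit_fixed  <- lm(log(Trait) ~ YearC + Var1 + Location, data = sim)
# Location as a random intercept
fit_random <- lmer(log(Trait) ~ YearC + Var1 + (1 | Location), data = sim)

# Year effect and its standard error from both fits
year_cmp <- rbind(
  fixed  = coef(summary(fit_fixed))["YearC", c("Estimate", "Std. Error")],
  random = coef(summary(fit_random))["YearC", c("Estimate", "Std. Error")]
)
print(year_cmp)
```

If the two rows agree closely for your data as well, that is the "very similar inferences" statement you can report alongside the fixed-effects model.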

Also note that, to be consistent, you should compare like with like, that is, fit the models:

glm(Trait ~ Year + Var1 + Location, family = gaussian(link = "log"), data = data)

and

glmer(Trait ~ Year + Var1 + (1|Location), family = gaussian(link = "log"), data = data)

or

lm(log(Trait) ~ Year + Var1 + Location, data = data)

and

lmer(log(Trait) ~ Year + Var1 + (1|Location), data = data)
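The reason the pairing matters is that the two families of models are not the same model: `lm(log(Trait) ~ ...)` describes the mean of log(Trait) with constant variance on the log scale, whereas a Gaussian GLM with a log link describes the log of the mean of Trait with constant variance on the original scale. A minimal sketch with simulated data (the `sim` data frame, `YearC`, and all numbers are assumptions for illustration, not your data) shows how to put the coefficients side by side using base R only:

```r
set.seed(42)
# Illustrative data only: names and effect sizes are assumptions
n <- 300
sim <- data.frame(
  YearC    = sample(0:15, n, replace = TRUE),   # years since first sample
  Var1     = rnorm(n),
  Location = factor(sample(paste0("L", 1:6), n, replace = TRUE))
)
# Multiplicative (log-normal) noise, so Trait is strictly positive
sim$Trait <- exp(1 + 0.03 * sim$YearC + 0.1 * sim$Var1 + rnorm(n, sd = 0.3))

# Models E[log(Trait)]: constant variance on the log scale
fit_lm  <- lm(log(Trait) ~ YearC + Var1 + Location, data = sim)
# Models log(E[Trait]): constant variance on the original scale
fit_glm <- glm(Trait ~ YearC + Var1 + Location,
               family = gaussian(link = "log"), data = sim)

# Coefficients side by side; both are on the log scale
cmp <- cbind(lm_logy = coef(fit_lm), glm_loglink = coef(fit_glm))
print(round(cmp, 3))
```

The slopes will often be close, but the intercepts and the residual assumptions differ, so a fixed-versus-random comparison is only clean when both models come from the same pair.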
Robert Long