
I have a data set with four columns (Subject, Time, Group, Analyte). The Group variable distinguishes a control group from a treatment group, and each subject is measured at 6 time points (time0, time1, time2, time3, time4, time5). In total, the data set includes 12 subjects (6 in the control group and 6 in the treatment group).

Subject  Time  Group   Analyte
1         0     1
1         1     1
1         2     1
1         3     1
1         4     1
1         5     1
2         0     1
2         1     1
2         2     1
2         3     1
2         4     1
2         5     1
3         0     1
3         1     1
3         2     1

Since I have a small data set, which of the following models should I use? In particular, do you recommend model 4 (with an unstructured correlation matrix and variance weights)? Why?

The first model is:

aov(Analyte ~ Group * Time + Error(Subject/Time), data = data)

The second model is:

lmer(Analyte ~ Time + Group + Time:Group + (1 | Subject), data = data)

The third model is:

lmer(Analyte ~ Time + Group + Time:Group + (Group | Subject), data = data) # random intercept and random Group slope per subject

The fourth model is:

gls(Analyte ~ Time * Group, data = data, correlation = corSymm(form = ~ 1 | Subject), weights = varIdent(form = ~ 1 | Time))
Farid
  • Whether a random slope or intercept is more appropriate depends on the research question and your prior knowledge. I doubt model 3 could work on a limited dataset though, as `(Group | Subject)` estimates both. `(0 + Group | Subject)` is only random slopes. – Frans Rodenburg Aug 31 '18 at 01:04
  • @Frans Rodenburg, thanks for the comment. What is your opinion of model 1 (the aov function)? What about the gls function? – Farid Aug 31 '18 at 01:53
  • To be honest I would prefer a simple correlation structure like compound symmetry (which you essentially imply in models 2 and 3), because that leaves fewer parameters to be estimated. As for the anova version, I know there are some issues with the type III error, so I personally avoid it, but you could check out some Q&As by Ben Bolker (who wrote lme4) or amoeba (who is more familiar with the differences). I believe there is a question outlining all the different ways to model mixed effects. I'll link it if I can find it. – Frans Rodenburg Aug 31 '18 at 02:58
  • Didn't find exactly what I had in mind, but related are: (about `gls` vs `lmer`) https://stats.stackexchange.com/a/14185/176202 and (about RM-ANOVA vs mixed models) https://stats.stackexchange.com/q/24314/176202 – Frans Rodenburg Aug 31 '18 at 07:46

1 Answer


You have too few subjects for the unstructured model. You will need to select the appropriate random-effects structure statistically, e.g., with likelihood ratio tests.
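As a rough illustration of the likelihood ratio test suggested above, the sketch below compares a random-intercept model with a random-intercept-and-slope model using `anova()`. The data frame `sim` and every value in it are simulated placeholders (an assumption, because the real Analyte values are not shown), and `lme` from nlme stands in for `lmer` only because nlme ships with R.

```r
library(nlme)  # recommended package, included with standard R installations

# Simulated stand-in for the data in the question (all values are made up):
# 12 subjects, 2 groups of 6 subjects, 6 time points per subject.
set.seed(1)
sim <- expand.grid(Time = 0:5, Subject = factor(1:12))
sim$Group <- factor(ifelse(as.integer(sim$Subject) <= 6, 1, 2))
s  <- as.integer(sim$Subject)
b0 <- rnorm(12, sd = 2)    # subject-specific intercept deviations
b1 <- rnorm(12, sd = 0.5)  # subject-specific slope deviations
sim$Analyte <- 10 + b0[s] +
  (1 + 0.5 * (sim$Group == 2) + b1[s]) * sim$Time +
  rnorm(nrow(sim))

# Same fixed effects, nested random-effects structures, both fitted by REML
ctrl    <- lmeControl(opt = "optim")
m_int   <- lme(Analyte ~ Time * Group, random = ~ 1 | Subject,
               data = sim, control = ctrl)
m_slope <- lme(Analyte ~ Time * Group, random = ~ Time | Subject,
               data = sim, control = ctrl)

# Likelihood ratio test: random intercept vs. random intercept + random slope
lrt <- anova(m_int, m_slope)
print(lrt)
```

Because a variance of zero lies on the boundary of the parameter space, the p-value from this kind of test tends to be conservative, so it is better treated as a guide than as a strict decision rule.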

Dimitris Rizopoulos
  • Thanks for the comment. Do you mean that, because I have a small data set, I should or should not use the unstructured model? Are you suggesting model 4? The data set includes 12 subjects (6 subjects with group=1 and 6 subjects with group=2) and 5-6 time points. – Farid Aug 31 '18 at 05:11
  • It would not be advisable to use the unstructured covariance matrix in this case: you have 6 measurements per subject, so this covariance matrix has 21 parameters to estimate (6 variances and 15 covariances), but you have only 12 subjects to estimate them from. – Dimitris Rizopoulos Aug 31 '18 at 07:23
  • @DimitrisRizopoulos, your answer appears to be a comment. Since you have enough reputation to use the comment section, please do so unless you intend to answer the question. – Frans Rodenburg Aug 31 '18 at 07:33
  • @FransRodenburg thanks for your feedback; however, I do believe that my response was an answer and not a comment, i.e., I suggested to use likelihood ratio tests to select the appropriate model, which was the original question, I think. – Dimitris Rizopoulos Aug 31 '18 at 08:05
  • @Dimitris Rizopoulos, so I should use model 2, am I right? Do I have to write my own corStruct class if I want to use gls? Could you please provide more information or the right code? – Farid Aug 31 '18 at 11:21
  • @Dimitris Rizopoulos, can I use model 4 if I have just 5 time points (repeats) instead of 6? – Farid Aug 31 '18 at 11:28
  • @Farid as mentioned in my answer, you will need to test to see which model is better. For nested models (e.g., models 3 and 4) you can do that using the `anova()` function that will perform a likelihood ratio test (BTW, for model 4 logically you would need Time | Subject not Group | Subject). – Dimitris Rizopoulos Aug 31 '18 at 11:34
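
The parameter count quoted in the comments above (21 for 6 time points) can be checked with a few lines of base R. This is only a sketch; the helper name `n_unstructured` is made up for illustration and is not part of any package.

```r
# Free (co)variance parameters of an unstructured covariance matrix for
# p repeated measurements: p variances plus p * (p - 1) / 2 covariances.
n_unstructured <- function(p) p * (p + 1) / 2

n_unstructured(6)  # 21
n_unstructured(5)  # 15

# For comparison, compound symmetry needs only 2 parameters
# (one variance and one common correlation), whatever the value of p.
```

With 5 time points the unstructured matrix still has 15 parameters, which remains more than the 12 subjects available to estimate them.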