
I have a dataset of ratings for participants in a within-subject design with 3 conditions and 3 trials per condition. The participants were divided into three groups, and the participants of each group were rated by a different group of raters. The problem is that I have an unequal number of data points across the groups of rated participants, caused by a different number of raters (and hence ratings) per group. The first group of 5 participants was rated by 5 raters (5 × 9 × 5 ratings), the second group of 5 by 9 raters (5 × 9 × 9 ratings), and the third group of 5 by 7 raters (5 × 9 × 7 ratings). So for some participants I have 3 × 5 ratings per condition, for others 3 × 9.

I would like to know whether such an unequal number of data points per participant is a problem if I plan to fit a linear mixed model (with random effects of participant and rater) to this data set to test for differences between conditions, and if it is, how best to deal with it. I would be grateful for any help.

Best, Pearson

Pearson
  • Please edit your question to say more about what you mean by "unequal amount of data points per participant." For example: were some not observed under one or more of the 3 conditions? Are you evaluating something like changes over time, with different time points for different participants? The more details you can provide about the structure of your study, the better any answer is likely to be. Please provide that extra information by editing the question, as comments are easy to overlook and can get deleted. – EdM Nov 24 '21 at 15:54
  • @EdM I hope the edit addressed all the issues raised. Or are any further clarifications needed? – Pearson Nov 26 '21 at 14:01

1 Answer


Treating participants and raters as random effects, with condition as a fixed effect, is a reasonable way to approach this situation. Modeling random effects involves partial pooling, such that participants/raters with more observations are weighted more heavily than others, while avoiding the large number of degrees of freedom that would be used up by modeling the participants and raters individually. This page and its links provide a useful entry to those issues. I find this answer in particular to be enlightening.
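As an illustrative sketch only (the question names no software; this assumes Python with statsmodels, and the data are simulated to match the described design of 15 participants, 21 raters in three sets, and 3 conditions × 3 trials), crossed random intercepts for participant and rater can be fit via statsmodels' variance-components mechanism, using a single dummy group that spans all rows:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(42)

# Simulated data matching the described design: 3 sets of 5 participants,
# rated by 5, 9 and 7 raters respectively; 3 conditions x 3 trials each.
raters_per_set = [5, 9, 7]
rows, rater_id = [], 0
for s, n_raters in enumerate(raters_per_set):
    set_raters = list(range(rater_id, rater_id + n_raters))
    rater_id += n_raters
    for p in range(s * 5, s * 5 + 5):
        for r in set_raters:
            for c in ("A", "B", "C"):
                for _ in range(3):
                    rows.append({"participant": p, "rater": r, "condition": c})
df = pd.DataFrame(rows)

# Toy ratings: fixed condition effect + participant and rater intercepts + noise
p_eff = rng.normal(0, 1, 15)
r_eff = rng.normal(0, 1, 21)
cond_eff = {"A": 0.0, "B": 0.5, "C": 1.0}
df["rating"] = (5 + df["condition"].map(cond_eff)
                + p_eff[df["participant"]] + r_eff[df["rater"]]
                + rng.normal(0, 1, len(df)))

# Crossed random intercepts via variance components: one dummy group
# containing all observations; re_formula="0" so the dummy group itself
# gets no random intercept of its own.
df["allobs"] = 1
model = smf.mixedlm("rating ~ condition", df, groups="allobs",
                    re_formula="0",
                    vc_formula={"participant": "0 + C(participant)",
                                "rater": "0 + C(rater)"})
result = model.fit()
```

The partial pooling mentioned above happens automatically: the estimated `participant` and `rater` variance components shrink individual intercepts toward zero, with less shrinkage for raters who contributed more observations.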

You still have to be careful, though.

First, think through what you are modeling. Random intercepts posit Gaussian variability about the overall model intercept, which in your situation would be the response at the reference value of condition. So random intercepts for participants and raters assume one such distribution among your 15 total participants and a separate one among your 9 (or more) raters. The fixed effects for the other 2 levels of condition are assumed the same among all participants and raters. If that makes sense in your situation, OK, but don't just proceed blindly.
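In symbols (with $p$ indexing participants, $r$ raters, $t$ trials, and indicator variables for the 2 non-reference conditions), the random-intercept model just described is

$$y_{prt} = \beta_0 + \beta_1\, \text{cond}_2 + \beta_2\, \text{cond}_3 + u_p + v_r + \varepsilon_{prt},$$

$$u_p \sim \mathcal{N}(0, \sigma_u^2), \quad v_r \sim \mathcal{N}(0, \sigma_v^2), \quad \varepsilon_{prt} \sim \mathcal{N}(0, \sigma^2),$$

where $\beta_0$ is the overall intercept at the reference condition, a single Gaussian $\sigma_u^2$ covers all participants, a single $\sigma_v^2$ covers all raters, and the condition coefficients $\beta_1, \beta_2$ are shared by everyone.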

Second, you have 2 ways of fitting the model, maximum likelihood (ML) and restricted maximum likelihood (REML). Make sure that you understand which you are choosing (even if by default) and why.
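In statsmodels (an assumption, as above; data simulated for illustration, with only a participant random intercept to keep the sketch short), the choice is a single flag on `fit()`, with REML the default:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)

# Toy data: 15 participants, 3 conditions x 3 trials each
rows = [{"participant": p, "condition": c}
        for p in range(15) for c in "ABC" for _ in range(3)]
df = pd.DataFrame(rows)
p_eff = rng.normal(0, 1, 15)
df["rating"] = (5 + (df["condition"] == "C") * 1.0
                + p_eff[df["participant"]] + rng.normal(0, 1, len(df)))

model = smf.mixedlm("rating ~ condition", df, groups="participant")
res_reml = model.fit(reml=True)   # REML (the default): less biased
                                  # variance-component estimates
res_ml = model.fit(reml=False)    # ML: needed when comparing models
                                  # that differ in their fixed effects
```

The usual rule of thumb is REML for reporting variance components and ML whenever likelihoods of models with different fixed effects are to be compared.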

Third, inference is not necessarily straightforward, in part because there isn't universal agreement about how many degrees of freedom should be assigned to the modeled random effects. See this page and its links for further reading. You should understand the assumptions underlying any p-values and the like that you generate.
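One common (large-sample, approximate) approach that sidesteps the degrees-of-freedom question for the random effects is a likelihood-ratio test of the fixed condition effect, comparing nested models both fit by ML. A sketch under the same statsmodels/simulated-data assumptions as above:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from scipy import stats

rng = np.random.default_rng(1)
rows = [{"participant": p, "condition": c}
        for p in range(15) for c in "ABC" for _ in range(3)]
df = pd.DataFrame(rows)
p_eff = rng.normal(0, 1, 15)
df["rating"] = (5 + (df["condition"] == "C") * 1.0
                + p_eff[df["participant"]] + rng.normal(0, 1, len(df)))

# Both models must be fit by ML (reml=False) for a valid LR test
full = smf.mixedlm("rating ~ condition", df,
                   groups="participant").fit(reml=False)
null = smf.mixedlm("rating ~ 1", df,
                   groups="participant").fit(reml=False)

lr_stat = 2 * (full.llf - null.llf)
p_value = stats.chi2.sf(lr_stat, df=2)  # 2 condition coefficients dropped
```

The chi-squared reference distribution is itself an asymptotic approximation, which is exactly the kind of assumption the paragraph above says you should be aware of before reporting p-values.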

EdM
  • Thank you very much! I think I have to get the Gelman & Hill book to get more detailed insights, or do you have any other recommendations? Regarding the random intercepts: I think that is also one of the problems. In total there were 15 subjects (P) and 21 raters, and as explained, 5 of them rated 5 of the participants (P 1-5) (in all conditions), 9 rated a further 5 participants (P 6-10), and 7 rated the last 5 subjects (P 11-15). Hence, I thought it was not possible to assume a common distribution of ratings across participants (as they were rated by different groups of raters). Or am I mistaken? – Pearson Nov 26 '21 at 19:05
  • @Pearson you can logically assume one common Gaussian distribution for all participants and another for all raters. The question is whether you will be able to fit a model to this set of data with 3 sets of completely separate participants and raters. Give it a try, as you are pooling information about the fixed effects. If that doesn't work, consider a hierarchical model in which each of the 3 separate sets of participants/raters is considered to be a higher-level random effect. Then you fit a random effect representing those 3 sets in addition to the participant and rater random effects. – EdM Nov 26 '21 at 22:25
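The hierarchical fallback EdM describes in the comment above (a random effect for the 3 participant/rater sets, in addition to the participant and rater effects) could be sketched as follows — again only an illustration assuming statsmodels and simulated data, and with only 3 levels the set-level variance component may be poorly estimated:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(7)

# Same simulated design as before, now recording which set each row belongs to
raters_per_set = [5, 9, 7]
rows, rater_id = [], 0
for s, n_raters in enumerate(raters_per_set):
    set_raters = list(range(rater_id, rater_id + n_raters))
    rater_id += n_raters
    for p in range(s * 5, s * 5 + 5):
        for r in set_raters:
            for c in "ABC":
                for _ in range(3):
                    rows.append({"set": s, "participant": p,
                                 "rater": r, "condition": c})
df = pd.DataFrame(rows)
df["rating"] = (5 + (df["condition"] == "C") * 1.0
                + rng.normal(0, 1, 15)[df["participant"]]
                + rng.normal(0, 1, 21)[df["rater"]]
                + rng.normal(0, 1, len(df)))

# Variance components for set, participant and rater, all expressed
# inside one dummy group spanning the whole dataset
df["allobs"] = 1
model = smf.mixedlm("rating ~ condition", df, groups="allobs",
                    re_formula="0",
                    vc_formula={"set": "0 + C(set)",
                                "participant": "0 + C(participant)",
                                "rater": "0 + C(rater)"})
result = model.fit()
```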