We are looking at the tournament performance of chess players over time and have a question about how to model the random effects. Every chess player belongs to at least one club, and some belong to two. Players play in multiple tournaments per year. We would like to build club membership into the model, but it is not clear to us how to specify it in, for example, lme4 in R.
Edit: Data example just to illustrate the problem
df_chess <- data.frame(
  performance = c(5, 6.5, 4.5, 3.5, 4, 8, 9, 8.5, 7.5, 4, 3.5, 3.5, 3),
  player = c("AD", "AD", "AD", "KA", "KA", "KA", "KA", "MM", "MM", "FR", "FR", "FR", "FR"),
  year = c(2007, 2007, 2008, 2007, 2007, 2008, 2008, 2007, 2008, 2007, 2007, 2007, 2008),
  club1 = c("KNL", "KNL", "KNL", "KNL", "KNL", "KNL", "KNL", "BBH", "BBH", "BBH", "BBH", "BBH", "BBH"),
  club2 = c(NA, NA, NA, "GEO", "GEO", "GEO", "GEO", "KNL", "KNL", NA, NA, NA, NA)
)
   performance player year club1 club2
1          5.0     AD 2007   KNL  <NA>
2          6.5     AD 2007   KNL  <NA>
3          4.5     AD 2008   KNL  <NA>
4          3.5     KA 2007   KNL   GEO
5          4.0     KA 2007   KNL   GEO
6          8.0     KA 2008   KNL   GEO
7          9.0     KA 2008   KNL   GEO
8          8.5     MM 2007   BBH   KNL
9          7.5     MM 2008   BBH   KNL
10         4.0     FR 2007   BBH  <NA>
11         3.5     FR 2007   BBH  <NA>
12         3.5     FR 2007   BBH  <NA>
13         3.0     FR 2008   BBH  <NA>
For simplicity, consider only random intercepts and no nesting structure. A model such as
fit <- lmer(performance ~ year + (1 | player) + (1 | club1) + (1 | club2), data = df_chess)
would drop every observation from players who are not part of a second club, because club2 is NA for them.
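To make the row loss concrete, here is a minimal base-R sketch of what listwise deletion does to the toy data. It assumes the default `na.action` (`na.omit`) is in effect, and the names `model_vars` and `kept` are only illustrative.

```r
# Same toy data as in the example above.
df_chess <- data.frame(
  performance = c(5, 6.5, 4.5, 3.5, 4, 8, 9, 8.5, 7.5, 4, 3.5, 3.5, 3),
  player = c("AD", "AD", "AD", "KA", "KA", "KA", "KA", "MM", "MM", "FR", "FR", "FR", "FR"),
  year = c(2007, 2007, 2008, 2007, 2007, 2008, 2008, 2007, 2008, 2007, 2007, 2007, 2008),
  club1 = c("KNL", "KNL", "KNL", "KNL", "KNL", "KNL", "KNL", "BBH", "BBH", "BBH", "BBH", "BBH", "BBH"),
  club2 = c(NA, NA, NA, "GEO", "GEO", "GEO", "GEO", "KNL", "KNL", NA, NA, NA, NA)
)

# Variables that appear in the model formula (illustrative helper name).
model_vars <- c("performance", "player", "year", "club1", "club2")

# Rows with no missing value in any model variable,
# i.e. the rows that survive na.omit (assumed default na.action).
kept <- df_chess[complete.cases(df_chess[, model_vars]), ]

nrow(df_chess)                     # 13 rows in the full data
nrow(kept)                         # 6 rows remain
as.character(unique(kept$player))  # "KA" "MM"
```

Only KA and MM, the two players with a second club, remain; all observations for AD and FR are dropped.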
Is there a way to include a random effect for the second club when only a subset of players has one (in addition to their first club), without losing the players who belong to only one club?
Thank you!