I am studying some Score in a population of young (Age=1
) and old people (Age=2
). Each person was studied several times (1-4). Score was measured before (Sweets=1
) and after (Sweets=2
) eating the sweet. Some older people have grandchildren (hasGrandchildren=1
). I assume that those who have grandchildren are biologically younger by the delta
value.
I get this model:
k[Sweets]*(Age+delta*hasGrandchildren)+I[subjectID]
The code to generate fake data:
library(tidyverse)
# define parameters
k_mu = c(-1,-2);
k_sigma = 0.1;
delta_mu = -0.5;
delta_sigma = 0.1;
I_mu = 10;
I_sigma = 1;
# 100 subjects were studied
subjectID <- seq(1:10^3)
Age <- sample(1:2,10^3,replace=TRUE) # 1 - young, 2 - old
# Only old have grandchildren
hasGrandchildren <-rep(0,10^3)
nOld <- sum(Age>1)
hasGrandchildren[Age>1] <-sample(c(0,1),nOld,replace = TRUE)
# generate Intercepts
I <-rnorm(10^3,I_mu,I_sigma)
# Get measured population
popDF<- tibble(subjectID,Age,hasGrandchildren,I)
# Everybody was tested before and after sweets
Sweets <-c(rep(1,10^3),rep(2,10^3))
popDF <-rbind(popDF,popDF)
popDF$Sweets <- Sweets
# on average every body was tested 4 times
expDF <-sample_n(popDF,4*10^3,replace=TRUE)
expDF$k<-rnorm(4*10^3,k_mu[expDF$Sweets],k_sigma)
expDF$delta<-rep(0,4*10^3)
expDF$delta[expDF$hasGrandchildren==1]<-rnorm(sum(expDF$hasGrandchildren==1),delta_mu,delta_sigma)
#######
# Model: Score = k[Sweets]*(Age+delta*hasGrandchildren)+I[subjectID]
expDF$Score <- expDF$k*(expDF$Age+expDF$delta*expDF$hasGrandchildren)+I
My aim is to estimate unobserved parameters k and delta.
I may do it successfully with rethinking or rstan package:
library(rethinking)
m1 <-ulam(
alist(
Score ~dnorm(mu,sigma),
mu<- k_est[Sweets]*(Age+delta_est*hasGrandchildren)+I[subjectID],
k_est[Sweets] ~ normal(-1,1),
I[subjectID] ~ normal(10,1),
delta_est ~ normal(0,1),
sigma ~ dexp(1)
),data=expDF,chains=2,cores=2,declare_all_data=FALSE)
Maybe I have not generated the fake data well (I don't have much experience with R), but the above model works fine on real data.
My problem is that I need to try to do without Bayesian statistics and use Linear Mixed-Effects Models, such as the lme4 library. I might be a complete idiot, but I don't understand how formulas are arranged in such packages. Reading the forums (here and here) didn't help.
Please help me to write Linear Mixed-Effects Models for my case.