0

I am measuring 2 responses in patients from different age cohorts. Each response is negatively correlated with age. This gives me a positive correlation of ResponseX and ResponseY in a total population. For each given age group ResponseX and ResponseY are negatively correlated. Moreover, the slope of this correlation is Age-dependent. Below you will find the code to generate fake data very similar to what I have:

library('ggplot2')
library(tidyverse)
library(lme4)

# average X and Y depends from Age
k_mu <- c(-5,-5)
i_mu <- c(1,2)

# inside each group X and Y are linearly dependent
k_grp <- -0.5
i_grp <- 0

# Generate Ages
N = 10^2
Age <-sample(0:2,N,replace = TRUE)
# add correlations inside groups
k <-k_grp*Age+i_grp
x <- rnorm(N,mean=0,sd=1)
y <- k*x+rnorm(N,mean=0,sd=0.5)
x<-x+k_mu[1]*Age+i_mu[1]
y<-y+k_mu[2]*Age+i_mu[2]
age<-factor(Age,ordered = TRUE)
data = data.frame(x,y,age)

If I will plot the glm regressions it looks pretty nice:

ggplot(data, aes(x = x, y = y, color = age) ) +
  geom_point() +
  geom_smooth(method = "lm", se = T)+
  geom_smooth(group=1,method = "glm", se = T,'color'='black')+
  labs(x='ResponseX', y='ResponseY', color='Age')

enter image description here

Now I want to find equations for all 4 lines from this plot and equation of dependence of ingroup slopes from age. To do this I tried to use lmer, but I could not find a solution. The following code gives me an error:

lmer(y ~ 1+age + (1+x|age), data = data) %>% summary()

Could you help me to model my data?

zlon
  • 639
  • 4
  • 20

1 Answers1

1

You could/should include x as fixed effects and an interaction x:age in your model. Please be aware, that I removed the order from your variable age, so that now it is a simple unordered factor:

age<-factor(Age)
data = data.frame(x,y,age)
m2=lmer(y ~ 1 + x + age + x:age + (1|age), data = data)
summary(m2)

This will give you the the slopes of each age-group as coefficients:

Linear mixed model fit by REML ['lmerMod']
Formula: y ~ 1 + x + age + x:age + (1 | age)
   Data: data

REML criterion at convergence: 151.3

Scaled residuals: 
     Min       1Q   Median       3Q      Max 
-2.03703 -0.75784 -0.09648  0.64946  2.94535 

Random effects:
 Groups   Name        Variance Std.Dev.
 age      (Intercept) 6.1309   2.4761  
 Residual             0.2341   0.4838  
Number of obs: 100, groups:  age, 3

Fixed effects:
             Estimate Std. Error t value
(Intercept)   1.88845    2.47930   0.762
x             0.12595    0.09123   1.381
age1         -6.85007    3.52032  -1.946
age2        -18.37227    3.57822  -5.134
x:age1       -0.62312    0.12065  -5.165
x:age2       -1.05771    0.12099  -8.742

Correlation of Fixed Effects:
       (Intr) x      age1   age2   x:age1
x      -0.039                            
age1   -0.704  0.028                     
age2   -0.693  0.027  0.488              
x:age1  0.030 -0.756  0.040 -0.021       
x:age2  0.030 -0.754 -0.021  0.112  0.570

So essentially, the output (fixed effects) can be interpreted as follows:

the intercept and slope for your first level of age is 1.8885 and 0.12595, respectively. The corresponding response y can be depicted below as the black trajectory given as

y = 1.8885 + x*0.12595

For the second level of age the intercept and slope is given as 1.8885 + (-6.850) and 0.12595 + (-0.62312), respectively. The corresponding response y is shown as the red trajectory given as

y = 1.8885 + (-6.850) + x*(0.12595-0.62312)

For the third level of age the intercept and slope is given as 1.8885 + (-18.37227) and 0.12595 + (-1.05771), respectively. The corresponding response y is shown as the green trajectory given as

y = 1.8885 + (-18.37227) + x*(0.12595-1.05771)

The plot:

plot(y ~ x, data, col=age)
lines(x=c(-20:20), y=1.888 + (-20:20)* 0.126)
lines(x=c(-20:20), y=1.888 -6.85 + (-20:20)* (0.126-0.623), col=2)
lines(x=c(-20:20), y=1.888 -18.37 + (-20:20)* (0.126-1.0577), col=3)

Your data with age-dependent slopes

As to your question Where is the slope and intercept of the black line (regression of average per age): I would't know how to do this with age as random effects in the model. The easiest way I can think of to get the average intercept and slope across all levels combined would be simple linear regression

lm(y ~ x, data)
RomanS
  • 46
  • 6
  • Thank you for the answer. But I am still a bit confused... Where is the slope and intercept of the black line (regression of average per age) and where is k_grp (dependence of slope from age) in this output? – zlon Jan 26 '22 at 13:06
  • Thinking about the model, I have to mention that it might be problematic to use a random effect with only 3 levels... according to @BenBolker it should be minimum 5 or 6 (see [https://stats.stackexchange.com/a/37665/343024]). – RomanS Jan 26 '22 at 15:04
  • Also, I forgot to include x as fixed effect. Sorry about the mess. I will refit the model with x as fixed effect and describe the output. ...I will edit my response accordingly. – RomanS Jan 26 '22 at 15:26