What does it mean when two random effects are highly or perfectly correlated?
That is, in R, when you call summary() on a fitted mixed model, the "Corr" column under "Random effects" shows 1 or -1.
summary(model.lmer)
Random effects:
 Groups Name                    Variance   Std.Dev.  Corr
 popu   (Intercept)             2.5714e-01 0.5070912
        amdclipped              4.2505e-04 0.0206167  1.000
        nutrientHigh            7.5078e-02 0.2740042  1.000  1.000
        amdclipped:nutrientHigh 6.5322e-06 0.0025558 -1.000 -1.000 -1.000
I know this is bad and indicates that the random-effects part of the model is too complex, but I'm trying to understand:
1) what is going on statistically, and
2) what is going on practically with the structure of the response variable.
Example
Here is an example based on "GLMMs in action: gene-by-environment interaction in total fruit production of wild populations of Arabidopsis thaliana" by Bolker et al.
Download data
download.file(url = "http://glmm.wdfiles.com/local--files/trondheim/Banta_TotalFruits.csv", destfile = "Banta_TotalFruits.csv")
dat.tf <- read.csv("Banta_TotalFruits.csv", header = TRUE)
Set up factors
dat.tf <- transform(dat.tf, X = factor(X), gen = factor(gen), rack = factor(rack),
                    amd = factor(amd, levels = c("unclipped", "clipped")),
                    nutrient = factor(nutrient, labels = c("Low", "High")))
Modeling log(total.fruits + 1) with "population" (popu) as the grouping factor for the random effects
library(lme4)
model.lmer <- lmer(log(total.fruits + 1) ~ nutrient * amd + (amd * nutrient | popu), data = dat.tf)
Accessing the correlation matrix of the random effects shows that everything is perfectly correlated
attr(VarCorr(model.lmer)$popu,"correlation")
                        (Intercept) amdclipped nutrientHigh amdclipped:nutrientHigh
(Intercept)                       1          1            1                      -1
amdclipped                        1          1            1                      -1
nutrientHigh                      1          1            1                      -1
amdclipped:nutrientHigh          -1         -1           -1                       1
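To see what this matrix implies numerically, here is a rough base-R sketch that rebuilds the random-effects covariance matrix from the standard deviations and correlations printed above and inspects its eigenvalues. The numbers are typed in by hand from the output, and the names sds, signs, Sigma and n_nonzero are only illustrative, not part of the fitted object. If the correlations really are exactly ±1, I would expect only one eigenvalue to be meaningfully different from zero, i.e. the four random effects would span a single dimension.

```r
# Standard deviations copied by hand from the summary() output.
re_names <- c("(Intercept)", "amdclipped", "nutrientHigh",
              "amdclipped:nutrientHigh")
sds <- c(0.5070912, 0.0206167, 0.2740042, 0.0025558)

# Sign pattern of the reported correlations: the interaction term is
# negatively correlated with the other three effects.
signs <- c(1, 1, 1, -1)
corr <- outer(signs, signs)   # 4 x 4 matrix of +1 / -1 entries

# Covariance matrix implied by those SDs and correlations.
Sigma <- diag(sds) %*% corr %*% diag(sds)
dimnames(Sigma) <- list(re_names, re_names)

# Eigenvalues measure how much variance lies along each independent direction.
ev <- eigen(Sigma, symmetric = TRUE)$values
n_nonzero <- sum(abs(ev) > 1e-10)   # tolerance is an arbitrary choice

print(signif(ev, 3))
print(n_nonzero)
```

On the actual fit, lme4's isSingular(model.lmer) should report the same kind of degeneracy, though I have not relied on it here so that the sketch stays in base R.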
I understand that these are the correlation coefficients of two vectors of random effects coefficients, such as
cor(ranef(model.lmer)$popu$amdclipped, ranef(model.lmer)$popu$nutrientHigh)
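As a sanity check on that interpretation, here is a small simulation sketch (base R only, not using the fitted model). It assumes each population has a single latent deviation z, and that all four random effects are just that deviation rescaled by the reported standard deviations and sign pattern. The number of populations (n_pop) and the names z, loadings and b are hypothetical choices for illustration. As far as I understand, the Corr column is an estimated model parameter rather than literally cor() applied to the conditional modes, but under this one-dimensional structure the two should agree.

```r
set.seed(1)

n_pop <- 9            # hypothetical number of populations
z <- rnorm(n_pop)     # one latent deviation per population

# Reported SDs and sign pattern, typed in from the summary() output.
sds <- c(0.5070912, 0.0206167, 0.2740042, 0.0025558)
signs <- c(1, 1, 1, -1)
loadings <- sds * signs

# Each column is the same latent vector z multiplied by a constant,
# so every column is an exact scalar multiple of every other column.
b <- outer(z, loadings)
colnames(b) <- c("Intercept", "amdclipped", "nutrientHigh", "interaction")

r_pos <- cor(b[, "amdclipped"], b[, "nutrientHigh"])
r_neg <- cor(b[, "amdclipped"], b[, "interaction"])

print(c(r_pos = r_pos, r_neg = r_neg))
```

If this picture is right, knowing a population's intercept deviation would determine its other three deviations exactly, which is what I mean by "redundant" below.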
Does a high correlation mean that the two random effects contain redundant information? Is this analogous to multicollinearity in multiple regression, where a model with highly correlated predictors should be simplified?