I have data for a time it takes an animal to fall off a rod (the longer the time the better the performance), which I collected three times a day over five consecutive days for the following balanced experimental design: my animals are from three different genotypes, where in each genotype I have six different animals: three males and three females.
Here's an R
code for generating an example data:
library(mvtnorm)
set.seed(1)
df <- data.frame(genotype = c(rep("a",90),rep("b",90),rep("c",90)),
animal = c(rep("a1",15),rep("a2",15),rep("a3",15),rep("a4",15),rep("a5",15),rep("a6",15),
rep("b1",15),rep("b2",15),rep("b3",15),rep("b4",15),rep("b5",15),rep("b6",15),
rep("c1",15),rep("c2",15),rep("c3",15),rep("c4",15),rep("c5",15),rep("c6",15)),
sex = rep(c(rep("M",45),rep("F",45)),3),
day = rep(c(rep(1,3),rep(2,3),rep(3,3),rep(4,3),rep(5,3)),18),
time = c(c(rmvnorm(3,log(1:5),matrix(c(0.5),5,5))),
c(rmvnorm(3,log(1:5),matrix(c(0.5),5,5))),
c(rmvnorm(3,log(1:5),matrix(c(0.5),5,5))),
c(rmvnorm(3,log(1.2:5.2),matrix(c(0.5),5,5))),
c(rmvnorm(3,log(1.2:5.2),matrix(c(0.5),5,5))),
c(rmvnorm(3,log(1.2:5.2),matrix(c(0.5),5,5))),
c(rmvnorm(3,log(1.2:5.2),matrix(c(0.5),5,5))),
c(rmvnorm(3,log(1.2:5.2),matrix(c(0.5),5,5))),
c(rmvnorm(3,log(1.2:5.2),matrix(c(0.5),5,5))),
c(rmvnorm(3,log(1.3:5.3),matrix(c(0.5),5,5))),
c(rmvnorm(3,log(1.3:5.3),matrix(c(0.5),5,5))),
c(rmvnorm(3,log(1.3:5.3),matrix(c(0.5),5,5))),
c(rmvnorm(3,log(0.8:4.8),matrix(c(0.5),5,5))),
c(rmvnorm(3,log(0.8:4.8),matrix(c(0.5),5,5))),
c(rmvnorm(3,log(0.8:4.8),matrix(c(0.5),5,5))),
c(rmvnorm(3,log(1:5),matrix(c(0.5),5,5))),
c(rmvnorm(3,log(1:5),matrix(c(0.5),5,5))),
c(rmvnorm(3,log(1:5),matrix(c(0.5),5,5)))),
stringsAsFactors = FALSE)
The use of rmvnorm
is for the autocorrelation that exists in the task times of the animals.
The use of log
is to obtain values that reflect a growth/learning curve over the five day period which is supposed to reflect the learning curve the animals go through in this task.
What I'd like to test is whether there is a difference in the slope of the curves between the three genotypes, accounting for the sex of the animal (sex by genotype interaction is assumed to be negligible and thus should not be modelled).
I thought I'd use the lavaan
R
package. I transformed my df
to a table format, where each row specifies an animal, its genotype, sex, and the means and variances of its three task times in each day:
df.tab <- do.call("rbind",lapply(unique(df$animal), function(x){
data.frame(genotype = unique(df$genotype[which(df$animal == x)]),
animal = x,
sex = unique(df$sex[which(df$animal == x)]),
day1.m = mean(df$time[which(df$animal == x & df$day == 1)]),
day2.m = mean(df$time[which(df$animal == x & df$day == 2)]),
day3.m = mean(df$time[which(df$animal == x & df$day == 3)]),
day4.m = mean(df$time[which(df$animal == x & df$day == 4)]),
day5.m = mean(df$time[which(df$animal == x & df$day == 5)]),
day1.e = var(df$time[which(df$animal == x & df$day == 1)]),
day2.e = var(df$time[which(df$animal == x & df$day == 2)]),
day3.e = var(df$time[which(df$animal == x & df$day == 3)]),
day4.e = var(df$time[which(df$animal == x & df$day == 4)]),
day5.e = var(df$time[which(df$animal == x & df$day == 5)]),
stringsAsFactors = FALSE)}))
df.tab$genotype = as.factor(df.tab$genotype)
df.tab$animal = as.factor(df.tab$animal)
df.tab$sex = as.factor(df.tab$sex)
Trying to fit a lavaan
growth
model to these data, where to start with, I only use the task time means:
library(lavaan)
model <- '
i =~ 1*day1.m + 1*day2.m + 1*day3.m + 1*day4.m + 1*day5.m
s =~ 0*day1.m + 1*day2.m + 2*day3.m + 3*day4.m + 4*day5.m
# regressions
i ~ genotype + sex
s ~ genotype + sex
'
fit <- growth(model, data=df.tab)
Unfortunately I get these warnings:
Warning messages:
1: In lav_data_full(data = data, group = group, group.label = group.label, :
lavaan WARNING: unordered factor(s) with more than 2 levels detected in data: genotype
2: In lav_samplestats_from_data(lavdata = lavdata, missing = lavoptions$missing, :
lavaan WARNING: sample covariance can not be inverted
3: In lav_object_post_check(lavobject) :
lavaan WARNING: some estimated variances are negative
4: In lav_object_post_check(lavobject) :
lavaan WARNING: observed variable error term matrix (theta) is not positive definite; use inspect(fit,"theta") to investigate.
I'd appreciate your help.