First off, I have a dataset with sparse longitudinal data: 30 individuals with 1 sample, 30 individuals with 2 samples, and 5 individuals with 3 samples. Various categorical variables are known for each individual, and I want to see whether these variables are associated with a drug level (a continuous variable). Let's just focus on one categorical variable: homelessness. The main issue is that the number of people who are homeless is not equal to the number who are not, so I cannot perform a simple Wilcoxon signed-rank test or most other paired tests. As a result, I fitted two linear models of the drug levels, each with a random slope/intercept for each individual: one for those who are homeless and another for those who are not. Of course, if I just run anova(linearmodel1, linearmodel2) I get the error:
"all fitted objects must use the same number of observations".
Edit:
As pointed out in a comment by @Roland (see link below in comments), one approach is to combine the data and fit 2 models: 1 with the variable homelessness and 1 without. Using polynomial regression this can be done with:
###Create some example data
mydata1 <- subset(iris, Species == "setosa", select = c(Sepal.Length, Sepal.Width))
mydata2 <- subset(iris, Species == "virginica", select = c(Sepal.Length, Sepal.Width))
#add a grouping variable
mydata1$g <- "a"
mydata2$g <- "b"
#combine the datasets
mydata <- rbind(mydata1, mydata2)
#model without grouping variable
fit0 <- lm(Sepal.Width ~ poly(Sepal.Length, 2), data = mydata)
###model with grouping variable
fit1 <- lm(Sepal.Width ~ poly(Sepal.Length, 2) * g, data = mydata)
#Compare models
anova(fit0, fit1)
#But this doesn't work in nlme
library(nlme)
fit1 <- lme(Sepal.Width ~ Sepal.Length * g, data = mydata)
#It throws an error (no random argument was supplied):
"invalid formula for groups"
######Not sure if this is the correct way
###nlme
#model without grouping variable
model0 = gls(Sepal.Width ~ Sepal.Length,data=mydata)
#model with grouping variable
model1 = lme(Sepal.Width ~ Sepal.Length ,random = ~1|g,data=mydata)
anova(model0,model1)
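###lme4
library(lme4)
#model without grouping variable (same data and variables as fm1)
fm0 <- lm(Sepal.Width ~ Sepal.Length, data = mydata)
#model with grouping variable
fm1 <- lmer(Sepal.Width ~ Sepal.Length + (1 | g), data = mydata, REML = FALSE)
#put the lmer fit first so that the lme4 anova method is used
anova(fm1, fm0)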
But how do I create two models with and without a specific group using nlme/lme4?
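My best guess so far, as a sketch only: keep a random intercept per individual in both models, add the variable of interest as a fixed effect in just one of them, and fit both by maximum likelihood so the likelihood ratio test compares the fixed effects. The data below are simulated and entirely hypothetical (the column names `id`, `homeless`, and `drug` and all effect sizes are my own assumptions); they only mimic the 30/30/5 sampling structure described above.

```r
library(nlme)

set.seed(1)

# Hypothetical sampling structure: 30 individuals with 1 sample,
# 30 with 2 samples, and 5 with 3 samples (105 rows in total)
n_samples <- c(rep(1, 30), rep(2, 30), rep(3, 5))
id <- factor(rep(seq_along(n_samples), times = n_samples))

# Hypothetical individual-level covariate and random intercepts
homeless_by_id <- rbinom(length(n_samples), 1, 0.3)
subj_effect <- rnorm(length(n_samples), sd = 1)

dat <- data.frame(
  id = id,
  homeless = factor(homeless_by_id[as.integer(id)], labels = c("no", "yes"))
)

# Simulated drug level: assumed group shift + individual effect + noise
dat$drug <- 5 + 0.5 * (dat$homeless == "yes") +
  subj_effect[as.integer(dat$id)] + rnorm(nrow(dat), sd = 0.5)

# Model without the variable of interest
m0 <- lme(drug ~ 1, random = ~ 1 | id, data = dat, method = "ML")
# Model with the variable of interest as a fixed effect
m1 <- lme(drug ~ homeless, random = ~ 1 | id, data = dat, method = "ML")

# Likelihood ratio test; both models use the same 105 observations
cmp <- anova(m0, m1)
print(cmp)
```

Both models are fitted to the same rows, so the "same number of observations" error should not come up. Is this the right way to set up the comparison?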
Thanks in advance