This is my first post, so please bear with me. I do not have a statistics background and am still learning my way around, so your answers will be very helpful. I am using GEE in R (the gee package) to compare the two levels of a categorical predictor (Gp) on a binary response (Resp). The data come from repeated measurements on a set of people (ID). I am interested in obtaining the odds ratio that compares the two levels of Gp. This is for my work, so I have created a dataset that reproduces the error below:
library(gee)
set.seed(50)
# Creating a dataset
df <- data.frame("Resp" = rep(0,40), "Gp" = rep(c("a","b"),20))
df <- df[order(df$Gp),]
df[1:10,"ID"] <- "P1"
df[11:20,"ID"] <- "P2"
df[21:30,"ID"] <- "P3"
df[31:40,"ID"] <- "P4"
df[c("Gp","ID")] <- lapply(df[c("Gp","ID")],as.factor)
# Creating first dataframe with all responses from group a as 0
df1 <- df  # plain assignment already gives an independent copy; copy() is not in base R or gee
df1[25:35,1] <- 1
table(df1$Gp,df1$Resp)
     0  1
  a 20  0
  b  9 11
All responses from Group a are 0. Now if I try to run the GEE, it ends with an error:
a <- gee(Resp ~ Gp, id= ID , data= df1, family=binomial(link=logit),na.action=na.omit,corstr = "exchangeable")
Beginning Cgee S-function, @(#) geeformula.q 4.13 98/01/27
running glm to get initial regression estimate
(Intercept)         Gpb
  -19.56607    19.76674
Error in gee(Resp ~ Gp, id = ID, data = df1, family = binomial(link =
logit), : Cgee: error: logistic model for probability has fitted value very
close to 1. estimates diverging; iteration terminated.
However, if I edit the same dataset so that some of the responses of level a are 1, the model runs without any issues:
df2 <- df  # plain assignment already gives an independent copy; copy() is not in base R or gee
df2[17:32,1] <- 1
table(df2$Gp,df2$Resp)
     0  1
  a 16  4
  b  8 12
b <- gee(Resp ~ Gp, id= ID , data= df2,
family=binomial(link=logit),na.action=na.omit,corstr = "exchangeable")
Beginning Cgee S-function, @(#) geeformula.q 4.13 98/01/27
running glm to get initial regression estimate
(Intercept)         Gpb
  -1.386294    1.791759
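For reference, this is how I am reading the odds ratio off the working case. It is only a sketch in base R: it uses glm() instead of gee(), so it ignores the repeated measurements per ID and only reproduces the initial estimate printed above. The vectors are rebuilt from the counts in the table rather than taken from df2, and the names fit, or_glm and or_table are just my own labels.

```r
# Rebuild the df2 counts: a has 16 zeros and 4 ones, b has 8 zeros and 12 ones
Gp   <- factor(rep(c("a", "b"), each = 20))
Resp <- c(rep(0, 16), rep(1, 4), rep(1, 12), rep(0, 8))

# Plain logistic regression (assumption: no clustering by ID, unlike gee)
fit <- glm(Resp ~ Gp, family = binomial(link = "logit"))

# Odds ratio of b versus a: exponentiate the Gpb coefficient
or_glm <- exp(coef(fit)[["Gpb"]])

# Same quantity computed by hand from the 2x2 table
tab      <- table(Gp, Resp)
or_table <- (tab["b", "1"] / tab["b", "0"]) / (tab["a", "1"] / tab["a", "0"])

print(c(glm = or_glm, table = or_table))
```

Both numbers should agree, since with a single two-level predictor the fitted log odds are just the sample log odds in each group.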
Am I right in assuming that this error occurs because all responses are 0 for group a? Is it not possible to compare the levels of a group in this scenario, given that we cannot use a chi-squared or Fisher's exact test either?
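To show what I suspect is going wrong, here is a minimal sketch in base R that computes the sample odds ratio by hand for the failing case. The vectors are rebuilt from the df1 counts, the repeated measurements per ID are ignored, and odds_a, odds_b and or_ba are just my own labels.

```r
# Rebuild the df1 counts: a has 20 zeros and 0 ones, b has 9 zeros and 11 ones
Gp   <- factor(rep(c("a", "b"), each = 20))
Resp <- c(rep(0, 20), rep(1, 11), rep(0, 9))

tab <- table(Gp, Resp)

# Odds of Resp == 1 within each group
odds_a <- tab["a", "1"] / tab["a", "0"]  # 0 / 20
odds_b <- tab["b", "1"] / tab["b", "0"]  # 11 / 9

# Odds ratio of b versus a: dividing by zero odds is not finite
or_ba <- odds_b / odds_a

print(c(odds_a = odds_a, odds_b = odds_b, or_ba = or_ba, log_or = log(or_ba)))
```

If I understand it correctly, the zero cell means the sample log odds ratio is not finite, which would explain why the coefficients in the initial glm estimate are so large and why the gee iterations diverge.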