1

Imagine someone sampled some chimpanzees, each chimpanzee was given two puzzle boxes to open to obtain a food reward (boxes A and B), for each case the person recorded success or failure to get the food reward. I only have the totals for each outcome – a total of (a) chimps opened both boxes, (b) opened B but not A, (c) opened A but not B, and (d) chimps opened neither. I want to test if the two box types are difficult to open or not. I used the following code:

a <- 16    # chimps who opened both boxes
b <- 16    # chimps who opened B but not A
c <- 29    # chimps who opened A but not B
d <- 17    # chimps who opened neither box

n <- a+b+c+d

chimp_ID    <- c(rep(1:n, each=2))
environment <- c(rep(0:1, times=n))
a_chimps    <- c(rep(0:0, times=a))
b_chimps    <- c(rep(1:0, times=b))
c_chimps    <- c(rep(0:1, times=c))
d_chimps    <- c(rep(1:1, times=d))

event <- c(a_chimps,b_chimps,c_chimps,d_chimps)
 
library(lme4)
summary(glmer(event ~ environment + (1 | chimp_ID), family=binomial, nAGQ=17))

This seems to work fine and give rational answers except if I have two zeros in the data set, imagine that no chimps managed to open box B, so that a = b = 0, now the model always fails to converge. Can anyone suggest a solution to this?

  • 1
    But, if a = b = 0 so all events should be in c. The d event is not possible, chimps cannt open B because b = 0, have only option event c, open box a. –  Feb 23 '21 at 17:01
  • My mistype - option d should be chimpanzees that opened neither box - so if no chimps can open box b - then both c and d remain possibilities – Graeme Ruxton Feb 26 '21 at 15:56
  • There are a couple of wonky things about your example. `0:0` and `1:1` don't work the way you think they do (you need `c(0,0)` or `rep(0,2)` and similarly for the other case). Can you clarify that your example above is otherwise correct (i.e. you've edited since the comments and `a=b=0` is now the correct case to check? – Ben Bolker Mar 08 '21 at 21:37
  • Sorry - the case to check is a = b = 0 and the code I was using was as follows: – Graeme Ruxton Mar 12 '21 at 21:21

2 Answers2

1

Just dumping this here for now: I think there's nothing wrong in principle with doing a likelihood ratio test even if there is complete separation in the fitted model ...

mkdata <- function(a=16, # opened both boxes
                   b=16, # opened B not A
                   c=29, # opened A not B
                   d=17) # opened neither
{
    n <- a+b+c+d
    dd <- data.frame(chimp_ID = rep(1:n, each=2),
                     environment = rep(0:1, times=n),
                     event=c(
                         rep(rep(0,2), times=a),
                         rep(1:0, times=b),
                         rep(0:1, times=c),
                         rep(rep(1,2), times=d)))
    return(dd)
}


library(lme4)
g1 <- glmer(event ~ environment + (1 | chimp_ID), family=binomial, nAGQ=17,
            data=mkdata())
mcnemar.test(matrix(c(16,16,29,17),byrow=TRUE,nrow=2))
drop1(g1,test="Chisq")

g2 <- update(g1,data=mkdata(0,0,20,20))
mcnemar.test(matrix(c(0,0,20,20),byrow=TRUE,nrow=2))
drop1(g2,test="Chisq")
Ben Bolker
  • 34,308
  • 2
  • 93
  • 126
0

This is a classic situation for McNemar's test. (To learn more about it, it may help you to read my answers here and here.)

You have the following $2\times 2$ tables:

> tab
   A
B    A  B
  A 16 16
  B 29 17

> tab2
     A
B     yes no
  yes   0  0
  no   29 17

You test either with McNemar's test:

> mcnemar.test(tab)

    McNemar's Chi-squared test with continuity correction

data:  tab
McNemar's chi-squared = 3.2, df = 1, p-value = 0.07364

> mcnemar.test(tab2)

    McNemar's Chi-squared test with continuity correction

data:  tab2
McNemar's chi-squared = 27.034, df = 1, p-value = 1.999e-07
gung - Reinstate Monica
  • 132,789
  • 81
  • 357
  • 650
  • Sorry to be difficult - I do know McNemar's test - my ultimate plan is to compare the performance of this model-fitting approach with McNemar's test. – Graeme Ruxton Feb 24 '21 at 18:52
  • @GraemeRuxton, there isn't going to be a simple solution to the non-convergence outside of McNemar's test. You have perfect separation. Various approaches to separation exist (see [here](https://stats.stackexchange.com/q/11109/)); you could try a Bayesian approach, but I don't see how that makes for straightforward comparisons w/ McNemar's test. – gung - Reinstate Monica Feb 24 '21 at 18:57