We have a binomial process that yields samples of 60 trials. To save time, once 2 failures have been observed the process is reset.

So if a test series hits 2 failures early, the resultant sample ends up being truncated.

ex <- data.frame(FAIL = c(1,1,0,2,0,2), PASS = c(59,59,60,5,60,2))

Samples that have 2 failures could be anywhere between 2/58 to 2/2.
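
To make the truncation concrete, here is a small sketch that flags which rows of `ex` stopped early. The column names `TRIALS` and `TRUNCATED` are only illustrative additions, and the limit of 60 trials is taken from the description above.

```r
ex <- data.frame(FAIL = c(1, 1, 0, 2, 0, 2), PASS = c(59, 59, 60, 5, 60, 2))

# Number of trials actually run in each sample
ex$TRIALS <- ex$FAIL + ex$PASS

# A sample is truncated if the second failure arrived before trial 60
ex$TRUNCATED <- ex$FAIL == 2 & ex$TRIALS < 60

print(ex)
```

Only the rows with 2 failures can be truncated; the remaining rows ran the full 60 trials.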

This causes a problem, because I can't know whether the 2/2 sample would have ended up as, say, 5/55 had the series not been terminated early.

The population failure probability P also differs from sample to sample, because some independent variables (IVs) change.

I'm having a hard time thinking about how to analyze these samples.

I know odds ratios would be valid, but I get lots of zero-count cells when stratifying into 2x2 tables.

Is this the proper way to weight for changes in N/variance with GLM?

glm(FAIL/PASS ~ IV1 + IV2 + IV3, family = quasibinomial, data = ex)

I'm confused about when to use weights or offset or both.

e.g.

glm(FAIL/PASS ~ IV1 + IV2 + IV3 + offset(FAIL + PASS), family = quasibinomial, data = ex)

or

glm(FAIL/PASS ~ IV1 + IV2 + IV3, weights = FAIL + PASS, family = quasibinomial, data = ex)
kjetil b halvorsen
Chad
  • 2
    If a trial terminates early, what you actually observe is the number of trials to the second failure, rather than the number of failures in a fixed number of trials. If all finished early, or none did, this would be easy (in the first case you could fit say a negative binomial model, in the second a straight binomial GLM). However, you may be able to treat it as a censored binomial response, perhaps. – Glen_b Apr 14 '15 at 00:48
  • 1
    One could treat it as a censored negative binomial: the response variable is the number of successes before the second failure, but if we do not reach 2 failures in 60 trials we only observe that the waiting time is $\ge 58$ (since the next trial *could be* the second failure). – kjetil b halvorsen Aug 01 '17 at 15:46

1 Answer

Let $N_i$ be the number of successes observed before the second failure, which has a negative binomial distribution. If you reach 60 trials without seeing the second failure, the observation is censored: you only observe that $N_i \ge 59$ (if you have seen one failure so far) or $N_i \ge 60$ (otherwise). You can then use likelihood methods with censored observations. There are many examples on this site; see for example "ML estimate of exponential distribution (with censored data)".
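
As a rough illustration of that likelihood, here is a minimal sketch in R for the simplest case of one common failure probability `p` and no covariates, using the example data from the question. The function name `negloglik` and the intercept-only setup are my own assumptions for illustration. Note that in R's `dnbinom`/`pnbinom` parameterization the event being counted up to `size` is what the question calls a failure, so `prob` is the failure probability and the count argument is the number of passes.

```r
ex <- data.frame(FAIL = c(1, 1, 0, 2, 0, 2), PASS = c(59, 59, 60, 5, 60, 2))

# Negative log-likelihood for a single failure probability p (no covariates).
negloglik <- function(p, dat) {
  observed <- dat$FAIL == 2  # second failure seen, so N_i = PASS is known exactly

  # Exact observations: P(N_i = PASS)
  ll_obs <- dnbinom(dat$PASS[observed], size = 2, prob = p, log = TRUE)

  # Censored observations: only N_i >= PASS is known, i.e. P(N_i > PASS - 1)
  ll_cens <- pnbinom(dat$PASS[!observed] - 1, size = 2, prob = p,
                     lower.tail = FALSE, log.p = TRUE)

  -(sum(ll_obs) + sum(ll_cens))
}

# One-dimensional maximum likelihood estimate of p
fit <- optimize(negloglik, interval = c(1e-6, 1 - 1e-6), dat = ex)
p_hat <- fit$minimum
print(p_hat)
```

With covariates, one possible extension (again only a sketch) is to replace `p` by `plogis(X %*% beta)` for a design matrix `X` and minimise over `beta` with `optim`.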

kjetil b halvorsen