truncated binomial samples with GLM

Question

We have a binomial process that yields samples of 60 trials. To save time, once 2 failures have been observed the process is reset.

So if a test series hits 2 failures early, the resultant sample ends up being truncated.

ex <- data.frame(FAIL = c(1,1,0,2,0,2), PASS = c(59,59,60,5,60,2))

Samples that have 2 failures could be anywhere from 2/58 to 2/2.

This causes a problem, because I can't know whether the 2/2 sample would have been, say, 5/55 if the series hadn't been terminated early.

Population P for each sample is also different due to some changing IVs.

I'm having a hard time thinking about how to analyze these samples.

I know odds ratios would be valid but I have lots of zero count cells when stratifying 2x2 tables.

Is this the proper way to weight for changes in N/variance with GLM?

glm(FAIL/PASS ~ IV1 + IV2 + IV3, family = quasibinomial, data = ex)

I'm confused about when to use weights or offset or both.

e.g.

glm(FAIL/PASS ~ IV1 + IV2 + IV3 + offset(FAIL + PASS), family = quasibinomial, data = ex)

or

glm(FAIL/PASS ~ IV1 + IV2 + IV3, weights = FAIL + PASS, family = quasibinomial, data = ex)

If a trial terminates early, what you actually observe is the number of trials to the second failure, rather than the number of failures in a fixed number of trials. If all finished early, or none did, this would be easy (in the first case you could fit say a negative binomial model, in the second a straight binomial GLM). However, you may be able to treat it as a censored binomial response, perhaps. — Glen_b, Apr 14 '15 at 00:48
One could treat it as a censored negative binomial: the response variable is the number of successes before the second failure, but if we do not reach 2 failures in 60 trials we only observe that the waiting time is $\ge 58$ (since the next trial *could be* the second failure). — kjetil b halvorsen, Aug 01 '17 at 15:46

score 2 · Answer 1 · answered Aug 01 '17 at 15:52

Let $N_i$ be the number of successes observed before the second failure, which will have a negative binomial distribution. But if you reach 60 trials without seeing the second failure, the observation is censored, and you only observe that $N_i \ge 59$ (if you have seen one failure so far) or $N_i \ge 60$ (otherwise). Then you can use likelihood methods with censored observations. There are many examples on this site; see, for example, "ML estimate of exponential distribution (with censored data)".
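A minimal sketch of that censored likelihood in R, under stated assumptions: the per-trial failure probability is linked to the covariates through a logit, the covariate `IV1` and its values are made up for illustration (the question gives no IV values), and `censored_nb_nll` is a hypothetical helper name. Series with 2 failures contribute the negative binomial probability mass; series that ran all 60 trials contribute the tail probability $P(N_i \ge \text{PASS}_i)$.

```r
# Data from the question plus a hypothetical covariate IV1 (illustration only)
ex <- data.frame(
  FAIL = c(1, 1, 0, 2, 0, 2),
  PASS = c(59, 59, 60, 5, 60, 2),
  IV1  = c(0, 1, 0, 1, 0, 0)
)

# Negative log-likelihood of a censored negative binomial model.
# beta: coefficients on the logit scale; X: design matrix;
# pass, fail: observed passes and failures; r: failures that stop a series.
censored_nb_nll <- function(beta, X, pass, fail, r = 2) {
  p_fail <- plogis(drop(X %*% beta))   # per-trial failure probability
  done   <- fail >= r                  # series stopped at the r-th failure
  ll     <- numeric(length(pass))
  # Uncensored: exactly `pass` successes before the r-th failure
  ll[done] <- dnbinom(pass[done], size = r, prob = p_fail[done], log = TRUE)
  # Censored: at least `pass` successes before the r-th failure, P(N >= pass)
  ll[!done] <- pnbinom(pass[!done] - 1, size = r, prob = p_fail[!done],
                       lower.tail = FALSE, log.p = TRUE)
  -sum(ll)
}

X   <- model.matrix(~ IV1, data = ex)
fit <- optim(c(-2, 0), censored_nb_nll, X = X, pass = ex$PASS, fail = ex$FAIL,
             method = "BFGS", hessian = TRUE)

fit$par                          # estimated logit-scale coefficients
sqrt(diag(solve(fit$hessian)))   # approximate standard errors
```

Note that a covariate pattern containing only censored series would push the estimated failure probability toward zero, so the made-up `IV1` above deliberately puts at least one stopped series in each group.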
