
A design has two predictor variables (both continuous). The response is pass/fail. I would like to know how to use binary logistic regression to find the values of the two variables that will give me a failure rate of 0.000001 with 95% confidence.

With one predictor variable, it's clear - but not with two. I usually use Minitab, but I would be limited to holding one predictor variable constant while varying the other - this could be done, but is tedious (and isn't really what I want to do).

I also don't want to overdesign.

I don't know if this can be done in Excel or R, but that would be good to know.

In many ways, I'm looking for the creation of a surface, where the predictor variables are on the "x" and "y" axes, and the probability is on the "z" axis. I would want the lower 95% confidence bound of this surface to meet a failure rate of 0.000001.

Any responses are much appreciated.

  • Unless you have an extraordinary amount of data, it's possible there will be no solution to this problem at all, because estimating a one-per-million rate is quite an extrapolation. Absent millions of data, you would need a region of intermediate values of the predictor variables where varying proportions of passes are observed, separating a region of all passes from another of all fails. You wouldn't be able to check for continued linearity of the response out in these regions, making the 95% confidence rather illusory. – whuber Dec 16 '11 at 21:16
  • Even though this requirement is very demanding, I think that with enough margin - as you say - the requirement could be shown. – Jay Greenstein Dec 16 '11 at 21:46
  • This perhaps makes my point more clearly, then: http://stats.stackexchange.com/a/4968. – whuber Dec 17 '11 at 22:37
  • There exist wonderful binomial confidence interval calculators. I like the one from the Australian veterinary folks (ausvet). Instead of designing for prevalence, you should think about designing for the worse case, not the worst case. When you look at how many samples are required, and the rate of success required, that gives you a sense of how reasonable or unreasonable those rates can be. https://epitools.ausvet.com.au/ciproportion – EngrStudent Apr 13 '21 at 15:31
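
To put numbers on the sample-size point raised in these comments (an illustrative calculation, not tied to any particular data set): with n trials and zero observed failures, the exact one-sided 95% (Clopper-Pearson) upper bound on the failure probability is 1 − 0.05^(1/n), so demonstrating a 0.000001 failure rate directly takes roughly three million failure-free trials. A minimal check in R:

    target <- 1e-6   # required failure rate

    # With n trials and 0 failures, the one-sided 95% Clopper-Pearson upper
    # bound on the failure probability is 1 - 0.05^(1/n); the smallest n
    # pushing that bound below the target:
    n_needed <- ceiling(log(0.05) / log1p(-target))
    n_needed   # about 3 million failure-free trials

    # Check: the exact one-sided 95% upper confidence limit at that n
    binom.test(0, n_needed, alternative = "less", conf.level = 0.95)$conf.int[2]

Any model-based approach that claims the same rate from far less data is leaning on the model's extrapolation, which is the concern raised in the first comment.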

1 Answer


Partially answered in comments:

Unless you have an extraordinary amount of data, it's possible there will be no solution to this problem at all, because estimating a one-per-million rate is quite an extrapolation. Absent millions of data, you would need a region of intermediate values of the predictor variables where varying proportions of passes are observed, separating a region of all passes from another of all fails. You wouldn't be able to check for continued linearity of the response out in these regions, making the 95% confidence rather illusory.

This perhaps makes my point more clearly, then: http://stats.stackexchange.com/a/4968.

  • whuber
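
For the surface described in the question, here is a minimal R sketch of the standard approach. It assumes a hypothetical data frame `dat` with continuous predictors `x1`, `x2` and a 0/1 response `fail`, and it inherits every caveat quoted above: fit the model with glm, predict on the logit scale over a grid of the two predictors, add a pointwise one-sided 95% margin, and back-transform. The operating points that meet the requirement "with 95% confidence" are then those where the upper confidence limit on the failure probability is at or below 0.000001.

    # Hypothetical data frame `dat` with columns x1, x2 (continuous) and
    # fail (0 = pass, 1 = fail)
    fit <- glm(fail ~ x1 + x2, family = binomial, data = dat)

    # Grid of predictor values covering (here) the observed range
    grid <- expand.grid(
      x1 = seq(min(dat$x1), max(dat$x1), length.out = 200),
      x2 = seq(min(dat$x2), max(dat$x2), length.out = 200)
    )

    # Predict on the link (logit) scale, add a pointwise one-sided 95%
    # margin, then back-transform: this is the "upper surface" over (x1, x2)
    pr <- predict(fit, newdata = grid, type = "link", se.fit = TRUE)
    grid$p_upper <- plogis(pr$fit + qnorm(0.95) * pr$se.fit)

    # Operating points that meet the 1e-6 requirement with ~95% confidence
    ok <- subset(grid, p_upper <= 1e-6)

    # Or draw the boundary as a single contour of the upper surface
    library(lattice)
    contourplot(p_upper ~ x1 * x2, data = grid, at = 1e-6)

Whether `ok` is non-empty anywhere near the data is exactly the issue in the quoted comments: fitted probabilities around one per million lie far out in the tail of the logit, so the 95% label depends entirely on the linearity assumption holding where it cannot be checked.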