Statistical design for failure analysis

Question

I am designing a test plan to find factors that correlate with pavement failure. Failure is a rare occurrence. How would I go about designing the test?

I have N factors, each with several levels, whose effects I’d like to estimate. This is an observational study: I can’t apply factor treatments in a laboratory, but I can select roadway locations based on factor levels.

I see these as my options…

Option 1: Use a balanced matrix of the N factors and draw random samples from each of the resulting subpopulations. Since failure is rare, I will probably end up with very few failed roadways in the sample, say 5% or less of the total.
Option 2: First identify roadways that have failed, then take 50% of my samples from the failed population and the other 50% at random from non-failed roadways. I think this approach is better, but I would have little or no control over the factor levels, and I won’t have a “balanced” data set when I run the statistics. Would I run a typical MANOVA on the data?
Option 3: A mixture of both, somehow. E.g., dedicate a minimal sample size to certain factor levels/combinations, ones that are presumed to be less influential, then dedicate the rest of the samples to the 50%-50% split of failed and non-failed roadways.
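
To make the trade-off between Options 1 and 2 concrete, here is a minimal Python sketch of both sampling schemes. Everything in it is a made-up illustration, not my real data: the two factor names, their levels, the inventory size, the exact 5% failure rate, and the sample size of 120 are all assumptions.

```python
import random
from collections import Counter, defaultdict

FACTORS = ("pavement_type", "surface_mix")  # hypothetical factor names


def build_inventory():
    """Hypothetical inventory: 6 factor cells x 1000 segments, exactly 5% failed."""
    inventory = []
    for ptype in ("asphalt", "concrete"):
        for mix in ("A", "B", "C"):
            for i in range(1000):
                inventory.append(
                    {"pavement_type": ptype, "surface_mix": mix, "failed": i % 20 == 0}
                )
    return inventory


def balanced_sample(segments, factor_keys, per_cell, rng):
    """Option 1: draw up to `per_cell` segments at random from every factor cell."""
    cells = defaultdict(list)
    for seg in segments:
        cells[tuple(seg[k] for k in factor_keys)].append(seg)
    sample = []
    for key in sorted(cells):
        members = cells[key]
        sample.extend(rng.sample(members, min(per_cell, len(members))))
    return sample


def case_control_sample(segments, n_total, rng):
    """Option 2: half failed segments (cases), half non-failed (controls)."""
    cases = [s for s in segments if s["failed"]]
    controls = [s for s in segments if not s["failed"]]
    n_each = min(n_total // 2, len(cases), len(controls))
    return rng.sample(cases, n_each) + rng.sample(controls, n_each)


rng = random.Random(0)  # fixed seed so the draw is repeatable
inventory = build_inventory()

option1 = balanced_sample(inventory, FACTORS, per_cell=20, rng=rng)
option2 = case_control_sample(inventory, n_total=120, rng=rng)

# Option 1: cell sizes are fixed by design, the number of failures is left to chance.
print("Option 1 failures:", sum(s["failed"] for s in option1), "of", len(option1))

# Option 2: the number of failures is fixed by design, cell sizes are left to chance.
print("Option 2 failures:", sum(s["failed"] for s in option2), "of", len(option2))
print("Option 2 cell counts:", Counter(tuple(s[k] for k in FACTORS) for s in option2))
```

With a 5% failure rate, the balanced draw should contain only a handful of failures on average, while the 50/50 draw guarantees 60 failures but lets the factor-cell counts fall wherever they fall. That is exactly the tension I am trying to resolve.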

What am I missing?

Could you give some more info? What are your factors? N=? ? Your option 2 is called (at least in epidemiology) a [tag:case-control-study] and that could be a good option! — kjetil b halvorsen, Nov 17 '21 at 01:31
We have 5 factors that we can control, most are nominal, and many are nested factors. I won't go into all the details, but they include things like Pavement Type, Pavement Surface Mix Type, Maintenance Treatment Material, Surface Type, Joint Location. There are lots of covariates that we will measure, but not control. — Bryan, Nov 17 '21 at 14:19
