Accounting for variation between populations when comparing failure rates

Question

Here's my situation. The big picture is that I need to compare the failure rates of two groups of widgets, Group A (defect) and Group B (control), in a way that answers the question, "What's the impact on failure rates of having the defect (being in Group A) relative to the control (Group B)?" Essentially, I'm trying to work out how much of the difference between the failure rates of A and B is explained by group membership rather than by other influences in the data.

However, where I'm getting hung up is that the two groups are not identically composed in terms of various attributes that could also influence failure rates. For example, Group A might have 20% built in January, 10% built in February, 40% built in March, and 30% built in April. But Group B might have 10% built in January, 30% built in February, 20% built in March, and 40% built in April.

So, if there were some build issue in February that made that batch worse than the others, Group B would be more heavily affected by that issue than Group A, which would muddy the "pure" comparison between Group A and B (i.e. defect vs. no defect). So it seems like you'd have to adjust for the different build-month proportions so that each month has the same influence on both groups.
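
To make the adjustment idea concrete, here is a minimal sketch of direct standardization: both groups' month-specific failure rates are re-weighted to one common build-month mix. The month shares are the ones from the example above; the per-month failure rates are made up purely for illustration, and using the average of the two mixes as the "standard" is just one possible choice.

```python
# Build-month mix from the example above (share of each group's widgets).
mix_a = {"Jan": 0.20, "Feb": 0.10, "Mar": 0.40, "Apr": 0.30}
mix_b = {"Jan": 0.10, "Feb": 0.30, "Mar": 0.20, "Apr": 0.40}

# HYPOTHETICAL per-month failure rates, invented for illustration only.
# February is the "bad batch"; A is 0.02 worse than B in every month.
rate_a = {"Jan": 0.04, "Feb": 0.10, "Mar": 0.04, "Apr": 0.04}
rate_b = {"Jan": 0.02, "Feb": 0.08, "Mar": 0.02, "Apr": 0.02}


def weighted_rate(rates, weights):
    """Overall failure rate implied by month-specific rates and a month mix."""
    return sum(rates[month] * weights[month] for month in rates)


# Crude rates: each group is weighted by its own build-month mix.
crude_a = weighted_rate(rate_a, mix_a)
crude_b = weighted_rate(rate_b, mix_b)

# Standardized rates: both groups are weighted by the same mix
# (assumption: the average of the two observed mixes).
standard_mix = {month: (mix_a[month] + mix_b[month]) / 2 for month in mix_a}
adj_a = weighted_rate(rate_a, standard_mix)
adj_b = weighted_rate(rate_b, standard_mix)

print(f"crude difference (A - B):        {crude_a - crude_b:.3f}")
print(f"standardized difference (A - B): {adj_a - adj_b:.3f}")
```

With these made-up rates the crude difference is 0.008, while the standardized difference is 0.020, which matches the within-month gap. The crude comparison understates the gap only because Group B contains more February builds.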

I have 10-12 attributes like this that I want to control/adjust for in my study, but I'm not really sure of the right way to do it, or whether there's a common method for accounting for composition differences in such a comparison. (Edit: all of my attributes are discrete.) What I've tried so far is stratifying across these 10-12 attributes and comparing the rate for Group A vs. Group B within each stratum, but there are a couple of problems with this.

First, the strata often become very small, so the individual comparisons are weak. Second, I might end up with a thousand strata, and I'm not really sure how to meaningfully answer the question (What's the effect of the defect on failure rates?) when some strata show A worse than B, some show B worse than A, and some show no difference.
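
One standard way to collapse many small strata into a single answer is the Mantel-Haenszel pooled odds ratio, which combines the per-stratum 2x2 tables without requiring each stratum to be informative on its own. Below is a minimal sketch; the function name `mantel_haenszel_or` and the counts are hypothetical, chosen so that some strata favor A, one favors B, and one shows no difference.

```python
def mantel_haenszel_or(strata):
    """Pooled odds ratio of failure, Group A vs. Group B, across strata.

    strata: iterable of (fail_a, ok_a, fail_b, ok_b) counts, one tuple per stratum.
    """
    numerator = 0.0
    denominator = 0.0
    for fail_a, ok_a, fail_b, ok_b in strata:
        n = fail_a + ok_a + fail_b + ok_b
        if n == 0:
            continue  # an empty stratum carries no information
        numerator += fail_a * ok_b / n
        denominator += ok_a * fail_b / n
    return numerator / denominator


# HYPOTHETICAL counts per stratum: (A failed, A ok, B failed, B ok).
strata = [
    (4, 16, 2, 28),  # A looks worse
    (1, 9, 3, 27),   # no difference
    (6, 34, 1, 19),  # A looks worse
    (1, 19, 2, 8),   # B looks worse
]

pooled = mantel_haenszel_or(strata)
print(f"pooled odds ratio (A vs. B): {pooled:.2f}")
```

A pooled value above 1 means that, across strata, failure is more likely in Group A. This summary assumes the true odds ratio is roughly the same in every stratum, so it is worth checking that assumption before relying on it.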

So, at the end of the day, I'm trying to isolate the influence of the defect on failure rates by somehow equalizing the influence of other attributes. I'm hoping that the community can point me to some common methods for doing this and/or help me better understand whether this is even an issue to be concerned about (it seems like an important one to me, but maybe I'm overthinking it).

This sounds like a variable selection problem, if I understand your question. Are you running a regression and including all of your 10-12 attributes (as factors)? Or are you looking at many, many differences in means? If you're doing the latter, a more meaningful way to answer your key question about the difference between Group A and Group B is to regress your outcome on an indicator for treatment, and then pick which other variables to use as controls. [This](http://stats.stackexchange.com/questions/130277/large-data-variable-selection) response to a question about variable selection might be useful. - Molly OW, Sep 26 '15 at 19:35
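
The regression suggested in the comment above could be sketched as a logistic regression of failure on a Group A indicator plus one control. To stay dependency-free, this sketch fits the model with plain gradient ascent on aggregated counts; in practice a statistics package would do the fitting. The data, the single control (`built_in_feb`), and the function name `fit_logistic` are all hypothetical.

```python
import math

# HYPOTHETICAL aggregated data: (is_group_a, built_in_feb, n_widgets, n_failed)
cells = [
    (0, 0, 300, 30),
    (0, 1, 800, 200),
    (1, 0, 880, 160),
    (1, 1, 100, 40),
]


def fit_logistic(cells, steps=20000, lr=1.0):
    """Fit logit(p) = b0 + b1*is_group_a + b2*built_in_feb by gradient ascent."""
    beta = [0.0, 0.0, 0.0]
    total = sum(n for _, _, n, _ in cells)
    for _ in range(steps):
        grad = [0.0, 0.0, 0.0]
        for is_a, is_feb, n, failed in cells:
            x = (1.0, is_a, is_feb)
            linear = sum(b * xi for b, xi in zip(beta, x))
            p = 1.0 / (1.0 + math.exp(-linear))
            # Gradient of the binomial log-likelihood for this cell.
            for j in range(3):
                grad[j] += (failed - n * p) * x[j]
        beta = [b + lr * g / total for b, g in zip(beta, grad)]
    return beta


beta = fit_logistic(cells)
adjusted_or = math.exp(beta[1])  # odds ratio for Group A, holding month fixed

# Crude odds ratio, ignoring build month entirely.
fail_a = sum(k for a, _, _, k in cells if a == 1)
n_a = sum(n for a, _, n, _ in cells if a == 1)
fail_b = sum(k for a, _, _, k in cells if a == 0)
n_b = sum(n for a, _, n, _ in cells if a == 0)
crude_or = (fail_a / (n_a - fail_a)) / (fail_b / (n_b - fail_b))

print(f"crude odds ratio:    {crude_or:.2f}")
print(f"adjusted odds ratio: {adjusted_or:.2f}")
```

In this made-up data the crude odds ratio is close to 1 because Group B has far more February builds, while the coefficient on the Group A indicator recovers an odds ratio of about 2 once build month is held fixed. With 10-12 discrete attributes, each would enter as a set of indicator columns in the same way.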

0 Answers