
I've been going down the rabbit hole of multiple testing, and I haven't quite figured it out yet. So let's take a simplified example: say I have 200 drugs and I want to test their effectiveness.

  • Option 1 is an ANOVA, but that only tests whether the groups differ overall, not which drugs differ.
  • Option 2 is ANOVA plus post-hoc pairwise tests with a correction like Scheffé. But since I'm testing SO MANY drugs, it's very unlikely that any of them will come out significant at all.
  • Option 3 is a linear regression where drug enters as 199 indicator terms against a baseline control (-reg y i.drug-). Then I do a multiple-test correction (-test drug, mtest(bonferroni)-). But this has the same problem as option 2 in that nothing will be found, AND it only compares the drugs to the control, not to each other.
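To put rough numbers on why options 2 and 3 feel hopeless, here is a minimal sketch of the arithmetic, assuming 199 drug-vs-control comparisons, a nominal alpha of 0.05, and independent tests (the independence is an assumption made only to keep the formula simple; the variable names are illustrative):

```python
# Assumed setup: 199 drug-vs-control comparisons at a nominal alpha of 0.05.
alpha = 0.05
m = 199

# Bonferroni: each individual comparison must clear alpha / m.
bonferroni_threshold = alpha / m

# With no correction, if every null were true and the tests were independent,
# the chance of at least one false positive is 1 - (1 - alpha)^m.
fwer_uncorrected = 1 - (1 - alpha) ** m

print(f"Bonferroni per-test threshold: {bonferroni_threshold:.6f}")
print(f"Uncorrected family-wise error rate: {fwer_uncorrected:.6f}")
```

So the uncorrected route is almost certain to produce at least one false positive, while the Bonferroni route demands a per-test p-value of roughly 0.00025.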

How do I address this? I'm at a loss. Seems like either I fish for p-values with data mining or I do a correction that pretty much guarantees I find all null results. Neither of those seems reflective of reality.

Hutchins
  • [False discovery rate](https://en.wikipedia.org/wiki/False_discovery_rate). Your title is a bit misleading: your question is about multiple comparisons, not specifically about pharmaco-epidemiology & clinical trials. Welcome to CV! – Alexis May 21 '20 at 15:47
  • What do you want to know about the 200 drugs? Need a clear statement of that before choosing type of _ad hoc_ test procedure. // A couple of options: (a) which one or two or ten seem most effective from data, and which if any is signif better than control? (b) identify all that are signif better than control? // Scheffe criterion may be too conservative. In (a) you'll be making only a few comparisons and maybe Tukey's HSD is best. In (b) there are ways to protect against false discovery only for comparisons of 199 vs control. // Maybe neither (a) nor (b) is what you intend. Then pls say what. – BruceET May 21 '20 at 22:42
  • How are you deciding what is best? Like I'm mainly curious about generally how to make these decisions – Hutchins May 22 '20 at 14:36
  • A study where you want to test ***200 (!)*** drugs is likely to be poor. The fact that options 2 & 3 are "unlikely to find any of them significant at all" is just making that fact clear. Rethink your study. If you really have no knowledge of the drugs that you can use to clarify your question, reduce the number of drugs investigated, and generally constrain the study, then [go do something else with your time](https://stats.stackexchange.com/a/321918/7290). I recognize that sounds harsh, but it is meant in your best interest: life is short, you don't want to waste it on projects like this. – gung - Reinstate Monica May 23 '20 at 15:30
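For anyone following the false discovery rate link in the first comment, the Benjamini-Hochberg step-up procedure is the usual starting point. Below is a minimal sketch in plain Python; the function name `benjamini_hochberg` and the p-values are made up for illustration and are not from any study or library.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return booleans: True where the null is rejected at FDR level q."""
    m = len(p_values)
    # Indices of the p-values in ascending order.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k such that p_(k) <= (k / m) * q.
    largest_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            largest_k = rank
    # Reject the hypotheses with the largest_k smallest p-values.
    rejected = [False] * m
    for idx in order[:largest_k]:
        rejected[idx] = True
    return rejected


# Made-up p-values, for illustration only.
example_p = [0.001, 0.008, 0.012, 0.04, 0.2, 0.5]

bh_rejected = benjamini_hochberg(example_p, q=0.05)
bonferroni_rejected = [p <= 0.05 / len(example_p) for p in example_p]

print("BH rejections:        ", sum(bh_rejected))
print("Bonferroni rejections:", sum(bonferroni_rejected))
```

On these toy numbers the FDR procedure rejects one more hypothesis than Bonferroni, which is the general pattern: it controls the expected share of false discoveries among the rejections rather than the chance of any false positive at all.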

0 Answers