In my analysis I am testing a lot of similar models. I have two variables (let's call them A and B) and a whole bunch of other variables (C1...C10). All models follow the same approach: each one tests whether there is an interaction effect between A or B and one of C1...C10 (see the sketch after the list below). So for example:
- model 1: interaction A with C1
- model 2: interaction A with C2
- ...
- model 10: interaction A with C10
- model 11: interaction B with C1
- model 12: interaction B with C2
- ...
- model 20: interaction B with C10
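
To make the setup concrete, here is a minimal sketch of how I fit these 20 models in a loop, assuming a continuous outcome and ordinary least squares with Python's statsmodels. The variable names (`y`, A, B, C1...C10) and the toy data are only placeholders, not my real data:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Toy stand-in for the real data set (all names are placeholders).
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame(rng.normal(size=(n, 12)),
                  columns=["A", "B"] + [f"C{i}" for i in range(1, 11)])
df["y"] = rng.normal(size=n)

fits = {}
for focal in ["A", "B"]:                         # the two variables of interest
    for mod in [f"C{i}" for i in range(1, 11)]:  # one C variable per model
        # "focal * mod" expands to main effects plus the interaction term
        fits[(focal, mod)] = smf.ols(f"y ~ {focal} * {mod}", data=df).fit()
```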
I can't combine all the C variables in one model because they are sometimes quite similar, so I would expect them to cause multicollinearity. Combining A and B would lead to a lot of NAs. For those two reasons I separated the models. On the other hand, I now have 20 different results and I am not sure how to interpret them correctly. My approach would be to look at the p-values (as an easy criterion) and conclude that models with a low p-value probably show some association (a significant interaction effect), whereas models with high p-values appear not to have a significant interaction effect and thus show results contrary to what I expected. A sketch of how I would collect these p-values is below.
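Continuing the sketch above, I would pull the interaction-term p-value out of each fitted model and put all 20 in one table. The Holm adjustment shown at the end is only an illustration of one possible way to account for running 20 tests, not something I have settled on:

```python
import pandas as pd
from statsmodels.stats.multitest import multipletests

# Collect the interaction-term estimate and p-value from every fitted model.
rows = []
for (focal, mod), fit in fits.items():
    term = f"{focal}:{mod}"                      # patsy's name for the interaction
    rows.append({"focal": focal, "moderator": mod,
                 "coef": fit.params[term], "p_raw": fit.pvalues[term]})

results = pd.DataFrame(rows)
# Optional: adjust for the fact that 20 interactions are being tested.
results["p_holm"] = multipletests(results["p_raw"], method="holm")[1]
print(results.sort_values("p_raw"))
```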
- Is this approach OK?
- I am not omitting any results; I will report the results with high p-values as well. As far as I understand, "p-hacking" is similar to "cherry-picking", so I should not run into such problems, or am I wrong?