
Let's say that I'm conducting a meta-analysis, looking at the performance of group A and group B with respect to a certain construct. Some of the studies I come across report that no statistically significant differences could be found between the two groups, but present no exact test statistics and/or raw data. In a meta-analysis, how should I handle such studies?

Basically, I see three different alternatives here:

  1. Include them all and assign to each one of them an effect size of 0.
  2. Throw them all out.
  3. Do some kind of power analysis for each one of them, or set a threshold at a certain number of participants. Include all studies that should have been able to reach statistical significance, assign each of them an effect size of 0, and throw the rest out.

I can see merits in all of these options. Option one is fairly conservative, and you only risk making a type II error. Option two raises the risk of making a type I error, but it also avoids having your results ruined by a bunch of underpowered studies. Option three seems like a middle road between options one and two, but a lot of assumptions and/or pure guesses have to be made (What effect size should you base your power analyses on? What number of participants should you demand from each study for it to pass?), probably making the final result less reliable and more subjective.
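To make option three concrete, here is a minimal sketch (Python with statsmodels; the assumed smallest effect of interest d = 0.5, the 80% power target, and the study sizes are placeholders, not values from any real study) of the kind of screening rule I have in mind:

```python
# Sketch: screen "no significant difference, nothing else reported" studies by
# whether their sample size should have sufficed to detect an assumed effect.
# Assumptions (placeholders): smallest effect of interest d = 0.5,
# two-sided alpha = 0.05, required power = 0.80, equal group sizes.
from statsmodels.stats.power import TTestIndPower

def adequately_powered(n_per_group, d=0.5, alpha=0.05, power=0.80):
    """True if n_per_group reaches the required power to detect effect d."""
    needed = TTestIndPower().solve_power(effect_size=d, alpha=alpha,
                                         power=power, alternative='two-sided')
    return n_per_group >= needed

# Hypothetical studies that report only "no significant difference":
studies = {"Study A": 12, "Study B": 80, "Study C": 35}  # n per group
for name, n in studies.items():
    decision = "include with d = 0" if adequately_powered(n) else "exclude"
    print(f"{name}: n/group = {n} -> {decision}")
```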

Speldosa
  • Don't assign zero, because that's an underestimate of the effect. One approach is to assign the value of the effect size which is associated with a p-value of 0.5 (the expected p-value if the null hypothesis is true). – Jeremy Miles Mar 25 '14 at 05:01 (a rough sketch of this conversion follows the comments)
  • What would make you even take a glance at a $P$-value when deciding what to do with a study when including it in a meta-analysis? Think *estimation*, not *hypothesis testing*. – Frank Harrell Mar 25 '14 at 13:32
  • @JeremyMiles "Don't assign zero, because that's an underestimate of the effect." - Is it? I mean, it *could* be, but since there's no available data, I simply can't know what the true effect for these studies is. – Speldosa Mar 25 '14 at 14:29
  • @FrankHarrell Studies reporting non-significant results give you a hint of what the actual effect size for that study might be. Studies not reporting anything at all are totally useless and only introduce noise, unless one assumes that there's a bias towards not reporting anything (not even a failed statistical test) when there isn't an effect present (maybe one should assume this to be the case?). The question I'm investigating for my meta-analysis is very often not the main question (or not even a question the authors are asking at all) of the studies that I'm looking at. – Speldosa Mar 25 '14 at 15:45
  • I'm not sure if you got my point. I would never be tempted to look at a $P$-value when deciding to include a study in the meta-analysis. Non-"significant" studies need to be given a weight that only depends on their precision/sample size, not on any observed effect. – Frank Harrell Mar 25 '14 at 15:52
  • @FrankHarrell - but to enter them as a datapoint into the meta-analysis, don't you need an effect size? – Jeremy Miles Mar 25 '14 at 16:18
  • @FrankHarrell If 99% of my studies were studies with n=2 and non-significant differences (remember, that's *all* the information I'm getting about the dependent variable), it would be madness to include them in my meta-analysis. However, excluding studies with n=100000 and non-significant differences (again, *zero* other information about the dependent variable) would be equally mad. And I need to draw the line somewhere, right? (I'm probably still not getting your point :P) – Speldosa Mar 25 '14 at 16:39
  • You are right that there are studies that are so tiny that the weight they would get would make them be ignored anyway. But there are many situations where the variation in sample sizes is not as extreme as what you stated. Any study should provide an effect estimate (or the raw data if $n=2$). So the discussion rests on there existing examples where $n=100000$ and no effect estimates or standard errors/standard deviations/confidence intervals are given. I personally cannot imagine such a paper being published. I guess you refer to a tertiary endpoint mentioned only in passing. – Frank Harrell Mar 25 '14 at 17:11
  • @FrankHarrell That's exactly my problem here. There are no effect estimates or raw data, no standard deviations or mean values, no t or F statistics, NOTHING except the statement that there were no significant differences between the groups when using a certain statistical test. And I can assure you that a ton of studies like these are published, since they're currently clogging up my desk :) And yes, as I mentioned above, most often what I'm looking for doesn't even form a hypothesis in the studies I'm looking at. It just so happens that it's sometimes tested for in passing. – Speldosa Mar 26 '14 at 12:39
  • An unusual situation indeed, pointing out once again the harm that hypothesis testing has done to science. Most producers of $P$-values fail to understand that large values of $P$ are non-informative unless you know $n$ and the precision of the estimate. Fisher's answer was "get more data". – Frank Harrell Mar 26 '14 at 12:46
  • @JeremyMiles I know you provided your 'effect size estimate based on a p-value of 0.5' suggestion a while back - do you know of any work I can cite if I adopt this approach? I'd be most grateful for any help! – Jan 18 '16 at 16:13
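As a rough illustration of the conversion Jeremy Miles describes, here is a minimal Python sketch (the normal approximation, the SE formula sqrt(1/n1 + 1/n2), and the group sizes are simplifying assumptions, not anything taken from the thread):

```python
# Sketch: convert a two-sided p-value into an approximate standardized mean
# difference (Cohen's d) and its standard error via the normal approximation.
# Setting p = 0.5 (the expected p-value under the null) gives the imputed
# effect size suggested in the comment above. The sign/direction is unknown.
import math
from scipy.stats import norm

def d_from_p(p, n1, n2, sign=1.0):
    """Approximate d and SE(d) from a two-sided p-value and group sizes."""
    z = norm.ppf(1 - p / 2)          # |z| implied by the two-sided p-value
    se = math.sqrt(1 / n1 + 1 / n2)  # rough SE of d for these group sizes
    return sign * z * se, se

# Hypothetical study with 40 participants per group and no reported statistics:
d, se = d_from_p(p=0.5, n1=40, n2=40)
print(f"imputed d = {d:.3f}, SE = {se:.3f}")  # roughly d = 0.15, SE = 0.22
```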

2 Answers


As you point out, there are merits to all three approaches. There clearly isn't one option that is 'best'. Why not do all three and present the results as a sensitivity analysis?

A meta-analysis conducted with ample and appropriate sensitivity analyses shows that the author is well aware of the limits of the data at hand, makes explicit the influence of the choices we make when conducting a meta-analysis, and is able to critically evaluate the consequences. To me, that is the mark of a well-conducted meta-analysis.

Anybody who has ever conducted a meta-analysis knows very well that there are many choices and decisions to be made along the way, and those choices and decisions can have a considerable influence on the results obtained. The advantage of a meta-analysis (or, more generally, a systematic review) is that the methods (and hence the choices and decisions) are made explicit, and one can evaluate their influence in a systematic way. That is exactly how a meta-analysis should be conducted.
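For example, one could run the same pooled model under each of the three inclusion rules and report the estimates side by side. Here is a minimal sketch (fixed-effect inverse-variance pooling; the effect sizes, sample sizes, the variance approximation 2/n, and the n >= 50 threshold are all hypothetical placeholders, not part of the original answer):

```python
# Sketch: the same fixed-effect (inverse-variance) pooling under three rules
# for handling "no significant difference, no statistics reported" studies.
# All numbers below are hypothetical.
import numpy as np

def pool(d, var):
    """Fixed-effect pooled estimate and its standard error."""
    w = 1.0 / np.asarray(var, dtype=float)
    est = np.sum(w * np.asarray(d, dtype=float)) / np.sum(w)
    return est, np.sqrt(1.0 / np.sum(w))

# Studies with usable statistics: (d, variance of d)
reported = [(0.30, 0.04), (0.12, 0.02), (0.45, 0.09)]
# Studies reporting only "not significant": n per group, with d imputed as 0
# and the variance of d approximated as 1/n1 + 1/n2 = 2/n.
unreported_n = [15, 60, 120]

d_rep, v_rep = (list(x) for x in zip(*reported))
kept = [n for n in unreported_n if n >= 50]   # rule 3's power-style threshold

rules = {
    "1. include all, d = 0": (d_rep + [0.0] * len(unreported_n),
                              v_rep + [2.0 / n for n in unreported_n]),
    "2. exclude all":        (d_rep, v_rep),
    "3. include if n >= 50": (d_rep + [0.0] * len(kept),
                              v_rep + [2.0 / n for n in kept]),
}
for label, (d, v) in rules.items():
    est, se = pool(d, v)
    print(f"{label}: pooled d = {est:.3f} (SE {se:.3f})")
```

Reporting the spread of these three estimates makes the influence of the handling choice explicit rather than hidden.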

Wolfgang

Here are the steps that I would take (and that I teach to my students):

1) Contact the authors of the original research. Be polite and request the exact effect estimates to use in your meta-analysis. The worst thing that can happen is that they don't reply or refuse to give you the information. The best-case scenario is that you get the exact information you were looking for.

2) If you have exact p-values, you can often back-calculate SDs with some degree of certainty (see the sketch at the end of this answer).

3) You make some sort of imputation. This could involve 'borrowing' the effect estimate from similarly sized trials, using the largest SD in the meta-analysis, using SDs from similar studies in the same meta-analysis, expert opinion, etc. There are many ways to impute the missing data, some more scientifically defensible than others, but the most important thing is that you are crystal clear about what you did and that you conduct a sensitivity analysis to determine the effect of the imputation(s) on the pooled effect estimate.

4) You put the studies with the missing data into the meta-analysis. The program (e.g. RevMan) will not give these studies any weight in the analysis because it won't be able to calculate an effect estimate and variance for them, but you will be able to visually show that there were additional studies with partial data that were not part of the pooled calculation.

5) You don't include data from these studies.

Pick your poison...
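As a rough illustration of step 2, here is a minimal Python sketch (assuming the study ran an independent-samples t-test with known group sizes and reported an exact two-sided p-value; the numbers and the equal-variance assumption are hypothetical, not from the answer above):

```python
# Sketch: back-calculate the |t| statistic, Cohen's d, and (if the group means
# are reported) a pooled SD from an exact two-sided p-value.
# Assumes an independent-samples t-test; all numbers below are hypothetical.
import math
from scipy.stats import t as t_dist

def from_exact_p(p, n1, n2, mean1=None, mean2=None):
    """Recover |t|, Cohen's d (sign unknown), and pooled SD if means are given."""
    df = n1 + n2 - 2
    t_val = t_dist.ppf(1 - p / 2, df)        # |t| implied by the p-value
    d = t_val * math.sqrt(1 / n1 + 1 / n2)   # Cohen's d
    sd_pooled = None
    if mean1 is not None and mean2 is not None and d > 0:
        sd_pooled = abs(mean1 - mean2) / d   # pooled SD from the mean difference
    return t_val, d, sd_pooled

# Hypothetical study: p = .04, 25 participants per group, means 3.1 vs 2.6
t_val, d, sd = from_exact_p(p=0.04, n1=25, n2=25, mean1=3.1, mean2=2.6)
print(f"|t| = {t_val:.2f}, d = {d:.2f}, pooled SD = {sd:.2f}")
```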

abousetta