Adding to @Jordan's answer above: in my practical experience at a federal agency that funds a lot of program evaluations, if results are not statistically significant at conventional levels, we typically don't read anything into the effect sizes at all. The idea is that if the effect is not statistically significant, we cannot rule out that the observed effect is due to chance alone, so however large the point estimate is, it may not be distinguishable from zero under repeated sampling.
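To make that concrete, here is a minimal sketch (hypothetical numbers, not from any actual evaluation) of how a small sample can produce a sizable estimated effect whose confidence interval still spans zero. The group size, means, and seed are arbitrary choices for illustration.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
n = 8  # small group size, chosen purely for illustration
treatment = rng.normal(loc=0.6, scale=1.0, size=n)
control = rng.normal(loc=0.0, scale=1.0, size=n)

# Estimated mean difference and standardized effect size (Cohen's d)
diff = treatment.mean() - control.mean()
pooled_sd = np.sqrt((treatment.var(ddof=1) + control.var(ddof=1)) / 2)
cohens_d = diff / pooled_sd

# Two-sample t-test (pooled variance) and 95% CI for the mean difference
t_stat, p_value = stats.ttest_ind(treatment, control)
se = pooled_sd * np.sqrt(2 / n)
ci = diff + np.array([-1, 1]) * stats.t.ppf(0.975, df=2 * n - 2) * se

print(f"Cohen's d = {cohens_d:.2f}, p = {p_value:.3f}, "
      f"95% CI = [{ci[0]:.2f}, {ci[1]:.2f}]")
```

With groups this small, even a "medium-to-large" estimated d will often come with a p-value above 0.05 and an interval that includes zero, which is exactly the situation where we would decline to interpret the effect size.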
There are, clearly, problems with "statistical significance," especially with small sample sizes. I don't want to get into philosophical arguments here, and I'm certainly not one to defend an approach based purely on p-values. This is just to give you some practical context for how (some) researchers and evaluators in the federal government respond to this sort of result.