Adding to @Jordan's answer above: in my practical experience at a federal agency that funds a lot of program evaluations, if results are not statistically significant at conventional levels, we typically don't read anything into the effect sizes at all. The idea is that if the effect is not statistically significant, we cannot rule out that the observed effect is due to chance alone, so however large the point estimate is, it may not be distinguishable from zero under repeated sampling.
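To make that concrete, here is a minimal sketch (hypothetical numbers, not from any actual evaluation) of how a small sample can produce a sizable estimated effect whose confidence interval still spans zero. The group size, means, and seed are arbitrary choices for illustration.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
n = 8  # small group size, chosen purely for illustration
treatment = rng.normal(loc=0.6, scale=1.0, size=n)
control = rng.normal(loc=0.0, scale=1.0, size=n)

# Estimated mean difference and standardized effect size (Cohen's d)
diff = treatment.mean() - control.mean()
pooled_sd = np.sqrt((treatment.var(ddof=1) + control.var(ddof=1)) / 2)
cohens_d = diff / pooled_sd

# Two-sample t-test (pooled variance) and 95% CI for the mean difference
t_stat, p_value = stats.ttest_ind(treatment, control)
se = pooled_sd * np.sqrt(2 / n)
ci = diff + np.array([-1, 1]) * stats.t.ppf(0.975, df=2 * n - 2) * se

print(f"Cohen's d = {cohens_d:.2f}, p = {p_value:.3f}, "
      f"95% CI = [{ci[0]:.2f}, {ci[1]:.2f}]")
```

With groups this small, even a "medium-to-large" estimated d will often come with a p-value above 0.05 and an interval that includes zero, which is exactly the situation where we would decline to interpret the effect size.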
There are, clearly, problems with "statistical significance," especially with small sample sizes. I don't want to get into philosophical arguments here, and I'm certainly not one to defend an approach based purely on p-values. This is just to give you some practical context for how (some) researchers and evaluators in the federal government respond to this sort of result.