How to test if two means are significantly similar?

Question

I'm trying to run an experiment to see if my test and control (in this case 2 survey types) are similar, as opposed to significantly different, from each other.

Is there a test or some other way to do this, almost like an inverse of a t-test?

[In hypothesis testing why do we need to use the reject null hypothesis approach but not the other way round?](https://stats.stackexchange.com/questions/397032/in-hypothesis-testing-why-do-we-need-to-use-the-reject-null-hypothesis-approach) — user2974951, Jul 21 '21 at 07:23

score 8 · Answer 1 · answered Jul 21 '21 at 08:34

It's a bit tricky but what you are looking for is called equivalence testing (see wiki).

The general idea is this: assume we have 2 surveys $A,B$, and we would like to formulate our hypotheses over their means $\mu_A, \mu_B$. The initial significance-testing hypotheses would be:

$$H_0:\mu_A=\mu_B,~~~H_1:\mu_A\neq\mu_B$$

Under the null we assume the means are exactly the same, meaning their difference is 0:

$$H_0:\mu_A-\mu_B=0=\mu_0,~~~H_1:\mu_A-\mu_B\neq0$$

and then you know what to do (t-test). The next step is defining an indifference region around the null hypothesis. That is, a region of half-width $\delta$ around the null value, $[\mu_0-\delta,\mu_0+\delta]$ (which in our case is simply $[-\delta,\delta]$), that we are indifferent to. The idea is to tolerate results around the null just as if they were the null, so we don't reject the null for minor differences. So our updated hypotheses are now:

$$H_0:|\mu_A-\mu_B|\leq \delta,~~~H_1:|\mu_A-\mu_B|>\delta$$

Notice that using absolute values requires a slight change to the statistical test. Still, our null hypothesis is that the means differ by at most $\delta$. It is up to the data (and the statistician) to prove that the means differ.

The final step here is moving to equivalence testing (closely related to noninferiority testing, its one-sided counterpart). The most important thing is the conceptual change: rather than assuming by default (i.e., under the null) that the means are equal, we now assume that they differ. Not only do we assume they differ, we assume the difference is at least $\delta$. We formulate our hypotheses as:

$$H_0:|\mu_A-\mu_B|\geq \delta,~~~H_1:|\mu_A-\mu_B|<\delta$$

So we have inverted the hypotheses, and now it is up to the data (and the statistician) to prove that the means do not differ.

This ain't a trivial move, and it requires choosing both $\delta$ and $\alpha$ prior to observing the data. I hope this answers your question.
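A common way to carry out the inverted test in practice is the two one-sided tests (TOST) procedure mentioned in the comments below: the null $|\mu_A-\mu_B|\geq\delta$ is split into $\mu_A-\mu_B\leq-\delta$ and $\mu_A-\mu_B\geq\delta$, and equivalence is claimed only if both are rejected. Here is a minimal sketch, assuming independent samples and a Welch (unequal-variance) standard error; the function name `tost_welch` and the example data are illustrative choices, not part of the original answer.

```python
import numpy as np
from scipy import stats


def tost_welch(a, b, delta):
    """Two one-sided Welch t-tests for H0: |mu_A - mu_B| >= delta.

    Returns the observed mean difference and the TOST p-value
    (the larger of the two one-sided p-values).
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    diff = a.mean() - b.mean()

    # Welch standard error and Welch-Satterthwaite degrees of freedom
    va = a.var(ddof=1) / a.size
    vb = b.var(ddof=1) / b.size
    se = np.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (a.size - 1) + vb ** 2 / (b.size - 1))

    # Test 1: H0 diff <= -delta, rejected for large t
    p_lower = stats.t.sf((diff + delta) / se, df)
    # Test 2: H0 diff >= +delta, rejected for small t
    p_upper = stats.t.cdf((diff - delta) / se, df)

    return diff, max(p_lower, p_upper)


# Illustrative data only: two samples whose means differ by 0.01
a = np.linspace(-1.0, 1.0, 101)
b = a + 0.01
diff, p = tost_welch(a, b, delta=0.5)
print(f"difference = {diff:.3f}, TOST p-value = {p:.3g}")
```

If the returned p-value is below the $\alpha$ chosen in advance, the data support $|\mu_A-\mu_B|<\delta$. The margin $\delta$ has to come from subject-matter knowledge; nothing in the code can choose it for you.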

Interesting answer! What about when the samples from $A$ and $B$ are not normally-distributed? — Alexandru Dinu, Jul 21 '21 at 09:21
If sample size is large enough you just use $\sqrt{n}(\bar{X}^A_n-\mu_A)/s_A$ which is $\sim N(0,1)$ due to CLT (however it does require that the first two moments of $A,B$ are finite). — Spätzle, Jul 21 '21 at 09:34
+1 @AlexandruDinu There are tests for equivalence of non-normal data also. A simple application is for *z* tests for proportion difference, and these have been extended to *z* approximate tests for rank sum and sign rank tests for equivalence. There are also omnibus equivalence tests (e.g., all means are different by at least $\delta$), and others in the uniformly most powerful test approach to equivalence, but that math is fancier than for TOST. See Wellek, S. (2010). Testing Statistical Hypotheses of Equivalence and Noninferiority (Second Edition). Chapman and Hall/CRC Press. — Alexis, Jul 21 '21 at 20:22
@AlexandruDinu You can also combine tests for equivalence and tests for difference (and thereby side-step confirmation bias in hypothesis testing). See my comments about combined inference in relevance tests in [my answer here](https://stats.stackexchange.com/questions/108911/why-does-frequentist-hypothesis-testing-become-biased-towards-rejecting-the-null/108914#108914). Note also that CV has a tag for the simple two one-sided tests approach to equivalence, \[[tost](https://stats.stackexchange.com/tags/tost/info)\], and an \[[equivalence](https://stats.stackexchange.com/questions/3038/)\] tag. — Alexis, Jul 21 '21 at 20:23
Ah, the Wellek book. Spent hundreds of unforgettable hours with it while writing my Master's thesis. — Spätzle, Jul 22 '21 at 04:05

score 2 · Answer 2 · Pitouille · 2021-07-21T08:05:49.490

First, I would encourage you to visualize your sample data to check whether your assumption of similarity is plausible. If it is, then I encourage you to determine a confidence interval (CI) for the difference of means; the narrower it is, the stronger your demonstration.

One possibility is a 95% CI (or even a 99% CI if you are sure that the means are similar); you can refer to the "basic steps" chapter of https://en.wikipedia.org/wiki/Confidence_interval. To demonstrate your point, your CI must include 0 and the lower and upper values of your CI must be very close to 0 (how close depends on the unit/scale of your test, of course). I hope it helps!
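As a rough sketch of the interval described above, the following computes a confidence interval for the difference of two means. It assumes independent samples and uses the Welch (unequal-variance) form; the function name `welch_ci` and the example data are illustrative, not part of this answer.

```python
import numpy as np
from scipy import stats


def welch_ci(a, b, level=0.95):
    """Confidence interval for mean(a) - mean(b) using Welch's approximation."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    diff = a.mean() - b.mean()

    # Per-sample variance of the mean, then Welch-Satterthwaite df
    va = a.var(ddof=1) / a.size
    vb = b.var(ddof=1) / b.size
    se = np.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (a.size - 1) + vb ** 2 / (b.size - 1))

    # Two-sided critical value for the requested confidence level
    half_width = stats.t.ppf(0.5 + level / 2.0, df) * se
    return diff - half_width, diff + half_width


# Illustrative data only
a = np.linspace(0.0, 1.0, 50)
b = np.linspace(0.0, 1.0, 60) + 0.1
low, high = welch_ci(a, b, level=0.95)
print(f"95% CI for the difference: [{low:.3f}, {high:.3f}]")
```

Whether the bounds count as "very close to 0" only has meaning relative to a margin $\delta$ fixed in advance. With the same standard error and degrees of freedom, checking that a 90% interval lies entirely inside $[-\delta,\delta]$ gives the same decision as two one-sided tests at $\alpha=0.05$.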

edited Jul 21 '21 at 08:05

answered Jul 21 '21 at 07:33


Hi Pitouille - thank you for the response. Can you expound on this a bit in simple terms? – jc315 Jul 21 '21 at 07:38
just adapted my answer... I hope it reads better! – Pitouille Jul 21 '21 at 08:06
thank you! i think i follow. also referencing other articles to read more on Confidence Intervals for Difference in Means. A couple of questions: 1) In layman terms, what does it mean if my confidence interval does not include 0 and what if it does include 0 (e.g. CI = [-0.18, .11] therefore we believe that the true difference in our means is in this range, close to 0, with 95% certainty?) 2) And do I still need to run a t-test on my experiment? – jc315 Jul 21 '21 at 08:16
Absolutely! The nature of your samples is important to consider (same size or not, paired, equal/unequal variances, etc...). Good luck :) – Pitouille Jul 21 '21 at 08:21
Yes, you have a probability of 95% to find the true mean difference to fall into this range. Again, the unit/scale of your experiment is important as well. If you are dealing with very small values in your sample, this CI might not be that convincing. So, it is important to draw your conclusion in your context. – Pitouille Jul 21 '21 at 08:28
@Pitouille No, that is not how you interpret a CI. This approach that you mention is based solely on subjectively checking the CI, there is no testing here. – user2974951 Jul 21 '21 at 09:48
I disagree that the confidence interval has to include zero. That seems to be using an insignificant hypothesis test as evidence in favor of the null. – Dave Jul 21 '21 at 09:54
@user2974951 absolutely! This is not a test. The answer from Spätzle is rather interesting. – Pitouille Jul 21 '21 at 09:55
@Dave, indeed... I had a paired sample in mind... but it will be much less convincing with samples of different sizes and/or unequal variances, where it will look more like failing to reject the null hypothesis... – Pitouille Jul 22 '21 at 07:05
