Is it ever useful to do a comparison of variance in an A/B test experiment?

Question

Typically in an A/B test, or more generally in a randomized controlled trial, we compare means to determine whether there was an effect. However, does one ever compare variances? For example, suppose the new treatment has the same mean as the control group but a much lower variance. In this way the new treatment could be considered "better". Would we ever compare variances to determine that the treatment is "better" than the control, or is this not done? If it is not done, why not?

B.Liu · Accepted Answer · 2022-02-21T10:51:02.767

Whether to run a test that compares the variances of two samples depends heavily on the hypothesis (and hence the evaluation criteria) of the experiment. If the experiment aims to change, in full or in part, the variability of the responses from the experimental units, then one should run a test comparing the variances.

I believe tests that compare means are far more popular than tests that compare variances in A/B tests (which nowadays are mostly randomized controlled trials in the tech and digital domains) because teams running experiments are often tasked with improving a KPI that is the mean response of the experimental units.

For example, the group of CRO (Conversion Rate Optimization) people in digital marketing and e-commerce are literally trying to optimize the conversion rate, which is the mean of the zeros and ones corresponding to whether someone "converts" or not.
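To make the "mean of zeros and ones" point concrete, here is a minimal sketch. The ten visitor outcomes are made up for illustration and do not come from any real experiment.

```python
import numpy as np

# Hypothetical outcomes for ten visitors: 1 = converted, 0 = did not convert.
converted = np.array([0, 1, 0, 0, 1, 0, 0, 0, 1, 0])

# The conversion rate is simply the mean of the binary outcomes.
conversion_rate = converted.mean()
print(f"conversion rate: {conversion_rate:.1%}")
```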

It is rarer, but not unheard of, for teams to be tasked with reducing the variability of the responses. A realistic enough example goes as follows:

A light bulb company receives many complaints every month. One reason among many is that the light bulbs it produces have a very unreliable lifetime. Say lifetimes are effectively normally distributed with a mean of 10,000 h and a standard deviation of 2,500 h (i.e. a variance of 6,250,000 h²). This means some bulbs burn out really quickly and some just seem to last forever.

Ultimately, the company wants to reduce the number of complaints per month. At that level, any improvement should still be evaluated with a test comparing means.

However, when the company goal filters down to the production department, it may no longer make sense for that department to work on that KPI, as too many things are outside its control. Maybe there are rude call center agents or unscrupulous salespeople in the company. Maybe the production department's contribution to reducing the number of complaints (the signal) just gets drowned out by the noise.

What the production department can concretely control through its work, though, is the variability of light bulb lifetime. If that is what it is tasked with, then it makes a lot of sense to run a test comparing the variances to show it is hitting its goals.
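As a rough sketch of what such a test could look like, the snippet below simulates bulb lifetimes and applies Levene's test (with median centering, also known as the Brown-Forsythe variant) from SciPy. The control parameters follow the example above. The treatment standard deviation of 1,500 h, the sample size of 500 per group, and the random seed are assumptions made purely for illustration. Levene's test is only one possible choice; an F-test or Bartlett's test are alternatives, but they are more sensitive to departures from normality.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# Simulated lifetimes in hours. Control uses the mean and SD from the example.
control = rng.normal(loc=10_000, scale=2_500, size=500)
# Assumed for illustration: same mean, smaller SD after the process change.
treatment = rng.normal(loc=10_000, scale=1_500, size=500)

# Sample variances (ddof=1 gives the unbiased estimator).
var_control = control.var(ddof=1)
var_treatment = treatment.var(ddof=1)

# Levene's test: null hypothesis is that both groups have equal variance.
levene_stat, p_value = stats.levene(control, treatment, center="median")

# Welch's t-test on the means, shown only for contrast with the variance test.
t_stat, p_means = stats.ttest_ind(control, treatment, equal_var=False)

print(f"variance (control):   {var_control:,.0f} h^2")
print(f"variance (treatment): {var_treatment:,.0f} h^2")
print(f"Levene p-value (variances): {p_value:.3g}")
print(f"Welch p-value (means):      {p_means:.3g}")
```

With a variance gap this large and 500 bulbs per group, the variance test should reject equality comfortably, while a comparison of means has no true difference to detect.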

Of course, how the production department links the reduction in light bulb lifetime variability to the number of complaints, and whether (as @Henry pointed out) solely reducing the variability of light bulb lifetime is the right thing to do, is another story.

Reducing the variance of bulb lifetimes leads to the different complaints that everything breaks just after the guarantee runs out. It might reduce perverse incentives for the production department to aim to minimise the proportion which fail before a specified time and not worry whether a few or many last much longer than this — Henry, Feb 21 '22 at 10:38
@Henry This is very true - this is more an attempt of giving a realistic-ish example (i.e. this can realistically happen in a factory with not-the-brightest managers) rather than asserting that it is the right thing to do! I've updated the answer based on what you said. — B.Liu, Feb 21 '22 at 10:47
One area where interest is in variance is in [tag:quality-control]. For an example see https://www.gaebler.com/Reducing-Variance.htm. You might also be interested in https://stats.stackexchange.com/questions/434928/unequal-variance-in-randomized-experiments-to-compare-treatment-with-control and the refs therein — kjetil b halvorsen, Feb 21 '22 at 11:41
