I recently joined a company running a lot of A/B tests with many bad practices. Although I can explain why most of them are flawed, there is one that I can't really wrap my head around.
Let's use an example. I want to measure the difference in Average Revenue Per User (ARPU) between two versions. Instead of calculating a sample size beforehand to know when to stop my test, I monitor the difference in ARPU between the two groups every morning. During the first few days this value oscillates quite a lot before converging steadily toward one value after N weeks. If this value is positive, I argue that my test group has a higher ARPU than the control group by $X (provided the test is statistically significant).
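To make the problem concrete, here is a small simulation sketch (the parameters and function name are my own, purely for illustration). It runs an A/A test, i.e. both groups are drawn from the same distribution so there is no true difference, and applies a naive two-sided z-test at the 5% level every "morning". Because we peek repeatedly and stop at the first significant result, the chance of ever declaring a winner ends up far above the nominal 5%:

```python
import numpy as np

rng = np.random.default_rng(0)

def peeking_false_positive_rate(n_sims=2000, n_days=30, users_per_day=100):
    """Simulate an A/A test (no real difference) with daily peeking.

    Returns the fraction of simulations where a naive z-test at the
    nominal 5% level was significant on AT LEAST ONE day.
    """
    false_positives = 0
    for _ in range(n_sims):
        # Both arms drawn from the same distribution: any "win" is spurious.
        a = rng.normal(0.0, 1.0, size=(n_days, users_per_day))
        b = rng.normal(0.0, 1.0, size=(n_days, users_per_day))
        for day in range(1, n_days + 1):
            xa, xb = a[:day].ravel(), b[:day].ravel()
            n = xa.size
            se = np.sqrt(xa.var(ddof=1) / n + xb.var(ddof=1) / n)
            z = (xa.mean() - xb.mean()) / se
            if abs(z) > 1.96:  # two-sided test at the nominal 5% level
                false_positives += 1
                break  # stop the test as soon as it "looks significant"
    return false_positives / n_sims

print(peeking_false_positive_rate())  # well above the nominal 0.05
```

With 30 daily peeks, this typically reports a false-positive rate several times the advertised 5%, which is exactly the issue with the monitor-every-morning procedure described above.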
This does not seem rigorous at all, but I have a hard time explaining why in layman's terms, because the procedure sounds quite intuitive.