
I recently joined a company running a lot of A/B tests with many bad practices. Although I can explain why most of them are flawed, there is one that I can't really wrap my head around.

Let's use an example. I want to measure the difference in Average Revenue Per User (ARPU) between two versions. Instead of calculating a sample size beforehand to know when to stop my test, I am monitoring the difference in ARPU between the two groups every morning. During the first few days this value oscillates quite a lot before converging steadily towards one value after N weeks. If this value is positive, I argue that my test group has a higher ARPU than the control group by $X (provided the test is statistically significant).

This does not seem rigorous at all, but I have a hard time explaining why in layman's terms, because the procedure sounds quite intuitive.
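To make the procedure concrete, here is a minimal A/A simulation sketch of what I am describing. The exponential revenue model, the daily sample size, the six-week horizon, and the 5% threshold are my own illustrative assumptions, not the actual setup; both groups draw from the same distribution, so any "winner" it declares is a false positive.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Hypothetical A/A setup: both groups draw revenue from the *same*
# distribution, so any "winner" the procedure declares is a false positive.
n_simulations = 2000
n_days = 42          # monitor for six weeks
users_per_day = 200  # new users per group per day (illustrative)

winners_declared = 0
for _ in range(n_simulations):
    control = np.empty(0)
    test = np.empty(0)
    for day in range(n_days):
        control = np.concatenate([control, rng.exponential(5.0, users_per_day)])
        test = np.concatenate([test, rng.exponential(5.0, users_per_day)])
        # Every "morning": look at the ARPU difference and run a t-test,
        # declaring a winner as soon as the result is statistically significant.
        _, p_value = stats.ttest_ind(test, control, equal_var=False)
        if p_value < 0.05:
            winners_declared += 1
            break

print(f"A 'significant' ARPU difference was declared in "
      f"{winners_declared / n_simulations:.1%} of A/A runs "
      f"(a fixed-sample test checked once at the end would give ~5%).")
```

Under these assumptions the daily-checking rule declares a winner far more often than the nominal 5%, even though the two groups are identical, which is the behaviour I would like to be able to explain simply.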

  • Does this answer your question? https://stats.stackexchange.com/questions/244646/why-is-it-wrong-to-stop-an-a-b-test-before-optimal-sample-size-is-reached/244664#244664 – mribeirodantas Dec 17 '21 at 10:28
  • Thank you for your answer. Yes and no, in the sense that I am comfortable explaining why it is bad practice to stop early based on the p-value, but it does not translate easily to my question, since here they are looking at the daily convergence of the metric of interest and not the p-value. – Thomas Reynaud Dec 17 '21 at 10:47
  • 1
    Look at [tag:sequential-analysis], and maybe add the tag – kjetil b halvorsen Dec 18 '21 at 23:25
  • The reason why this approach (of stopping when achieving a small p-value) is seriously flawed is discussed at https://stats.stackexchange.com/questions/310119. – whuber Dec 18 '21 at 23:28

0 Answers