I work at an online agency where we run a lot of A/B tests to compare proportions between two groups (test vs. control). Standard industry practice for testing a difference in proportions is based on either the normal distribution or the chi-squared distribution.
Chi-squared ($\chi^2$) based tests tend to need a lot of data, which you don't always have, while normal distribution tests are problematic because proportions are bounded by $0$ and $1$, unlike the normal approximation. I claimed to my colleagues that a test based on the beta distribution should always outperform both the normal and chi-squared options, since it's built for proportions.
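For reference, here is a minimal sketch of the two standard tests I am comparing against, run on the same counts I use further down. The variable names are my own and purely illustrative. The chi-squared version uses base R's `prop.test` with the continuity correction switched off, and the normal version is a pooled two-proportion z-test written out by hand.

```r
# Successes and failures per group (same counts as in my beta-based code)
successes <- c(test = 46, control = 33)
failures  <- c(test = 2643, control = 2579)
trials    <- successes + failures

# Chi-squared test of equal proportions, without continuity correction
chi_res <- prop.test(successes, trials, correct = FALSE)

# Pooled two-proportion z-test (normal approximation), computed by hand
p_hat  <- successes / trials
p_pool <- sum(successes) / sum(trials)
se     <- sqrt(p_pool * (1 - p_pool) * sum(1 / trials))
z      <- unname((p_hat["test"] - p_hat["control"]) / se)
p_z    <- 2 * pnorm(-abs(z))  # two-sided p-value

# Compare the two test statistics and the two p-values side by side
c(chi_squared = unname(chi_res$statistic), z_squared = z^2)
c(p_chi = chi_res$p.value, p_z = p_z)
```

As far as I can tell, for a 2x2 comparison like this the uncorrected chi-squared statistic is just the square of the pooled z statistic, so the two p-values printed at the end should agree.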
Here is my R code for the beta-based test. By my reading, it shows that the test group outperformed the control group (at the 95% level):
library(ggplot2)
# Observed counts per group
number_of_success_test <- 46
number_of_success_control <- 33
number_of_failures_test <- 2643
number_of_failures_control <- 2579
# Draw from Beta(successes, failures) for each group's conversion rate
test1 <- rbeta(100000, number_of_success_test, number_of_failures_test)
test2 <- rbeta(100000, number_of_success_control, number_of_failures_control)
test <- data.frame(test1, test2)
# 95th percentile of the control group's draws
quantile(test2, 0.95)
g <- ggplot(data=test, aes(x=test1)) +
  geom_density(color="red") +
  geom_density(aes(x=test2)) +
  geom_vline(xintercept=quantile(test2, 0.95)) +
  geom_vline(xintercept=quantile(test1, 0.5),
             color="red")
# annotate() draws each label once; y=Inf pins the labels to the top of the panel
g + xlab("CR") +
  annotate("text", label="95th percentile - control group",
           x=quantile(test2, 0.95), y=Inf, vjust=1.5) +
  annotate("text", label="50th percentile - test group",
           x=quantile(test1, 0.5), y=Inf, vjust=3, color="red")
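Besides comparing percentiles on the plot, another summary I could compute from the same kind of draws is the share of simulated pairs in which the test rate exceeds the control rate, plus an interval for the difference. This is only a sketch: the seed and the variable names are my own choices for illustration, and it assumes the two groups are independent and keeps the Beta(successes, failures) parameterization used above.

```r
set.seed(1)  # arbitrary seed, only for reproducibility
n_draws <- 100000

# Same Beta(successes, failures) parameterization as in my code above
draws_test    <- rbeta(n_draws, 46, 2643)
draws_control <- rbeta(n_draws, 33, 2579)

# Share of simulated pairs in which the test rate exceeds the control rate
prob_test_better <- mean(draws_test > draws_control)

# Central 95% interval for the difference in conversion rates
diff_interval <- quantile(draws_test - draws_control, c(0.025, 0.975))

prob_test_better
diff_interval
```

If this is a sensible way to read the draws, `prob_test_better` would be the number I report instead of eyeballing where the two vertical lines fall.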
Am I right? Is it really always better to use the beta distribution over the chi-squared or normal distribution when dealing with a difference in proportions? (Also, is the approach in my R code right?)