Is the beta distribution really better than the normal distribution for testing the difference of two proportions?

Question

I work at an online agency where we run a lot of A/B tests to compare proportions between two groups (test vs. control). Standard industry practice for testing a difference of proportions is based on either the normal distribution or the chi-squared distribution.

Chi-squared ($\chi^2$) tests tend to need a lot of data, which you don't always have, while tests based on the normal distribution are problematic because proportions are bounded by $0$ and $1$, unlike the normal distribution. I claimed to my colleagues that a test that uses the beta distribution should always outperform both the normal and chi-squared options, since it's built for proportions.

Here is my R code to perform the test; it shows that the test group outperformed the control group (at the 95% level):

library(ggplot2)
set.seed(1)  # arbitrary seed so the simulated draws are reproducible

number_of_success_test     <-   46
number_of_success_control  <-   33
number_of_failures_test    <- 2643
number_of_failures_control <- 2579

# Draw from Beta(successes, failures) for each group
test1 <- rbeta(100000, number_of_success_test,    number_of_failures_test)
test2 <- rbeta(100000, number_of_success_control, number_of_failures_control)
test  <- data.frame(test1, test2)

control_q95 <- quantile(test2, 0.95)  # 95th percentile of the control draws
test_q50    <- quantile(test1, 0.5)   # median of the test draws
control_q95

g <- ggplot(data = test, aes(x = test1)) +
  geom_density(color = "red") +       # test group
  geom_density(aes(x = test2)) +      # control group
  geom_vline(xintercept = control_q95) +
  geom_vline(xintercept = test_q50, color = "red")

# annotate() draws each label once; geom_text() would repeat it for every row.
# Label heights are placed below the density peaks so the curves stay visible.
g + xlab("CR") +
  annotate("text", label = "95th percentile - control group",
           x = control_q95, y = 150) +
  annotate("text", label = "50th percentile - test group",
           x = test_q50, y = 120, color = "red")

Am I right? Is it really always better to use the beta distribution over the chi-squared / normal distribution when dealing with a difference in proportions? (Also, is the approach in my R code right?)

What is your hypothesis? What is your test statistic? What is the sampling distribution of that test statistic under the null hypothesis? My point is that you cannot claim that a particular distribution is more "suitable" than another purely based on its distributional properties such as its support. The chi-square test and other tests based on a normal approximation are what they are because that is how the test statistic is asymptotically distributed. Then there is the Fisher exact test which needs no explanation as to why it is suitable--it is based on the exact distribution. — heropup, Jun 30 '14 at 16:05
Another way to get at the issues @heropup raises is: what would it mean for one of these to be "better" than the other in your case? — gung - Reinstate Monica, Jun 30 '14 at 16:12

score 9 · Accepted Answer · edited Apr 13 '17 at 12:44

From your code (and my knowledge of A/B testing), I gather your proportions come in discrete increments. That is, every person who visits a site ends up categorized as a "success" or a "failure". In other words, your proportions come from a finite number of Bernoulli trials; they are not continuous proportions. As a result, the beta distribution (which is for continuous proportions) is not really appropriate here. Instead, you should use the binomial distribution. Provided your $n$'s are large enough relative to the proportion of successes, the normal approximation is quite acceptable (the standard rule of thumb is that the lesser of $np$ and $n(1-p)$ should be $>5$; in your case those values are the success counts, $46$ and $33$). I would go with the chi-squared test in your situation, and not use the beta distribution.
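
As a rough sketch of that recommendation, using the counts from the question (the variable names below are illustrative, not from the original post), base R's `prop.test()` and `chisq.test()` both carry out the chi-squared test on the two groups:

```r
# Counts taken from the question; variable names are illustrative.
successes <- c(test = 46, control = 33)
failures  <- c(test = 2643, control = 2579)
trials    <- successes + failures

# Rule-of-thumb check: the smaller of n*p and n*(1-p) in each group,
# estimated here by the observed successes and failures, should exceed 5.
min_count <- pmin(successes, failures)
print(min_count)

# Two-sample test of equal proportions
# (chi-squared statistic, continuity correction on by default).
res <- prop.test(x = successes, n = trials)
print(res)

# The same test expressed as a chi-squared test on the 2x2 table.
tab  <- cbind(successes, failures)
res2 <- chisq.test(tab)
print(res2$p.value)
```

Both calls use the default two-sided alternative; a one-sided test of "test is better than control" would need `alternative = "greater"` in `prop.test()`.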

If you didn't have enough successes to trust the normal approximation, you could use a permutation test, as @jbowman discusses here: The $z$-test vs. the $\chi^2$-test for comparing the odds of catching a cold in two groups.
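
A minimal sketch of such a permutation test, assuming the counts from the question and a difference in proportions as the test statistic (the number of permutations, the seed, and all names are arbitrary choices made here, not taken from the linked thread):

```r
set.seed(1)  # arbitrary seed, for reproducibility only

# Rebuild the individual 0/1 outcomes from the counts in the question.
outcome <- c(rep(1, 46), rep(0, 2643), rep(1, 33), rep(0, 2579))
group   <- rep(c("test", "control"), times = c(46 + 2643, 33 + 2579))

# Test statistic: success proportion in test minus that in control.
diff_prop <- function(y, g) mean(y[g == "test"]) - mean(y[g == "control"])
observed  <- diff_prop(outcome, group)

# Shuffle the group labels to simulate the null of no difference.
n_perm    <- 2000
perm_diff <- replicate(n_perm, diff_prop(outcome, sample(group)))

# Two-sided p-value; the +1 terms count the observed statistic itself,
# and the small tolerance guards against floating-point ties.
p_value <- (sum(abs(perm_diff) >= abs(observed) - 1e-12) + 1) / (n_perm + 1)
print(c(observed = observed, p_value = p_value))
```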

On the other hand, if your proportions were continuous (e.g., the mass of a tumor as a proportion of the mass of an organ), the beta distribution would be preferable. You could use beta regression in an ANOVA-ish way (i.e., only having categorical predictor variables). I have a simple example of beta regression in R that could be adapted to such a situation here: Remove effect of a factor on continuous proportion data using regression in R.

answered Jun 30 '14 at 16:32 by gung - Reinstate Monica · edited Apr 13 '17 at 12:44 by Community

Thanks for this great answer. I have indeed seen many people use the binomial distribution, but I must say that I'm still confused. The binomial distribution (which has an upper bound of n, not 1) describes a random variable for which you know N and P in advance, while the beta distribution describes the opposite: the distribution of P (which you really don't know) as a function of successes and failures (which is what you really do know). My methodology (see the R code) compares the 95th percentile of the control group (the distribution under H0) to the test statistic of the test group (the mean). – Yehoshaphat Schellekens Jun 30 '14 at 17:01
Is my methodology right? Thanks for the advice. – Yehoshaphat Schellekens Jun 30 '14 at 17:02
I don't believe your methodology is correct, @YehoshaphatSchellekens, sorry. Eg, you seem to be using the numbers of successes & failures as the `shape1` & `shape2` parameters, but that isn't what those parameters are. – gung - Reinstate Monica Jun 30 '14 at 17:26
As for using the binomial distribution in a literal way, it's just a matter of how you conceptualize what you are doing. You could use the chi-squared or z-test, in which case it's harder to see the connection, or you could test the *number of successes* directly, in which case it is more obviously binomial. From the way you describe the issue, I wonder if you are coming from a Bayesian perspective where you use the beta distribution as a prior for the P of a binomial. In frequentist statistics, we don't think of P as having a distribution. – gung - Reinstate Monica Jun 30 '14 at 17:31
As for the bounds & the normal approximation, it is true that the normal goes to infinity in both directions, but after a certain distance it is so close to 0 as to be negligible. That is why you just need to check that P is far enough from 0 or 1 given your N (ie, the np>5 part). – gung - Reinstate Monica Jun 30 '14 at 17:31
I'll explain a bit more. It's true that P doesn't have a distribution, but "what is the probable value of P" given the successes and failures will change as a function of those successes and failures. Indeed, I saw this approach on a "Bayesian" site, but I don't see any reason not to use it. I'd be really happy for more advice; thanks so far! – Yehoshaphat Schellekens Jun 30 '14 at 17:35
The whole theory behind what you are doing, how the tests work, and what they mean differs between the Bayesian and frequentist approaches to statistics. It is generally not best to mix them; it will just lead you into problems, as apparently it has done here. To learn more about all of this, it may help you to read our threads categorized under the bayesian tag. – gung - Reinstate Monica Jun 30 '14 at 17:48
OP: Your concern about the binomial (way back in the early comments) being for counts is simply a matter of scaling. Divide the count of successes by $n$, the number of trials (possible number of successes), and that *scaled* binomial is on the multiples of $\frac{1}{n}$ that lie between 0 and 1 inclusive. As such, a binomial is an appropriate model for count-proportions. – Glen_b Jul 01 '14 at 01:22
OK, I think I'll take your advice and use the binomial distribution. Thanks again! – Yehoshaphat Schellekens Jul 01 '14 at 04:41
This entire discussion fails to address the fact that the Beta distribution is the conjugate prior to the Binomial distribution, and that the posterior Beta distribution of a proportion has concentration parameters equal to the sum of the prior Beta concentration parameters and the counts in the respective categories. – Brash Equilibrium Jul 06 '14 at 06:07
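
Written out, the conjugate update that this comment describes looks as follows, where $a, b$ are the prior Beta parameters and $s, f$ are the observed successes and failures (symbols chosen here for illustration; they do not appear in the thread):

```latex
p \sim \mathrm{Beta}(a, b), \qquad
s \mid p \sim \mathrm{Binomial}(s + f,\; p)
\;\Longrightarrow\;
p \mid s, f \sim \mathrm{Beta}(a + s,\; b + f)
```

For instance, assuming a uniform prior $a = b = 1$ and the test group's counts $s = 46$, $f = 2643$, the posterior would be $\mathrm{Beta}(47, 2644)$.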

score 0 · Answer 2 · answered Dec 03 '14 at 19:17

As other commenters said, the number of successes is distributed binomially. Therefore, if you want to sample/simulate, use rbinom().

That said, the beta distribution is a conjugate prior for the binomial distribution. Therefore, if you want to obtain the distribution of the parameter of your binomial distribution from observations, use dbeta().
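
A hedged sketch of how the two functions fit together for the counts in the question. The uniform Beta(1, 1) prior, the seed, the grid, and all variable names are assumptions made for illustration; this is a Bayesian comparison, not the frequentist test discussed in the accepted answer:

```r
set.seed(1)  # arbitrary seed, for reproducibility only

# Assumed prior: Beta(1, 1), i.e. uniform on [0, 1].
prior_a <- 1
prior_b <- 1

# Simulating data: the number of successes in n trials is binomial.
sim_successes <- rbinom(5, size = 46 + 2643, prob = 46 / (46 + 2643))
print(sim_successes)

# Posterior density of the test group's success probability on a grid:
# Beta(prior_a + successes, prior_b + failures).
grid      <- seq(0.005, 0.03, length.out = 200)
dens_test <- dbeta(grid, prior_a + 46, prior_b + 2643)

# Posterior draws for both groups, compared draw by draw.
n_draws      <- 100000
post_test    <- rbeta(n_draws, prior_a + 46, prior_b + 2643)
post_control <- rbeta(n_draws, prior_a + 33, prior_b + 2579)

# Posterior probability that the test rate exceeds the control rate,
# and a 95% credible interval for the difference.
prob_test_better <- mean(post_test > post_control)
ci_diff <- quantile(post_test - post_control, c(0.025, 0.975))
print(prob_test_better)
print(ci_diff)
```

Comparing the two posteriors draw by draw accounts for the uncertainty in both groups, whereas comparing one group's median with the other group's 95th percentile does not.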
