
I'm using the formula from Bayesian A/B testing in order to compute the results of an A/B test with a Bayesian methodology.

$$ \Pr(p_B > p_A) = \sum^{\alpha_B-1}_{i=0} \frac{B(\alpha_A+i,\beta_B+\beta_A)}{(\beta_B+i)B(1+i,\beta_B)B(\alpha_A, \beta_A)} $$

where

  • $\alpha_A$ is one plus the number of successes for A
  • $\beta_A$ is one plus the number of failures for A
  • $\alpha_B$ is one plus the number of successes for B
  • $\beta_B$ is one plus the number of failures for B
  • $B$ is the Beta function

Example data:

control: 1000 trials with 78 successes
test: 1000 trials with 100 successes

A standard, non-Bayesian proportion test gives me a significant result at the 10% level (p < 0.10):

prop.test(n=c(1000,1000), x=c(100,78), correct=F)

#   2-sample test for equality of proportions without continuity correction
# 
# data:  c(100, 78) out of c(1000, 1000)
# X-squared = 2.9847, df = 1, p-value = 0.08405
# alternative hypothesis: two.sided
# 95 percent confidence interval:
#  -0.0029398  0.0469398
# sample estimates:
# prop 1 prop 2 
#  0.100  0.078 

while my implementation of the Bayesian formula (following the explanation in the link) gives me very weird results:

# successes for control + 1
a_control <- 78+1
# failures for control + 1
b_control <- 1000-78+1
# successes for test + 1
a_test <- 100+1
# failures for test + 1
b_test <- 1000-100+1

is_control_better <- 0
for (i in 0:(a_test-1) ) {
  is_control_better <- is_control_better+beta(a_control+i,b_control+b_test) / 
                       (b_test+i)*beta(1+i,b_test)*beta(a_control,b_control)

}

round(is_control_better, 4)
# [1] 0

That means that $P(TEST > CONTROL)$ is $0$, which doesn't make any sense given this data.

Could someone clarify?

Matt Krause

1 Answer


On the site you quote there is a notice:

The beta function produces very large numbers, so if you’re getting infinite values in your program, be sure to work with logarithms, as in the code above. Your standard library’s log-beta function will come in handy here.

so your implementation is wrong. Below is the corrected code:

a_A <- 78+1
b_A <- 1000-78+1
a_B <- 100+1
b_B <- 1000-100+1

total <- 0

for (i in 0:(a_B-1) ) {
  total <- total + exp(lbeta(a_A+i, b_B+b_A)
                       - log(b_B+i)
                       - lbeta(1+i, b_B)
                       - lbeta(a_A, b_A))

}

It outputs total = 0.9576921, which is the "odds that B will beat A in the long run" (quoting your link); this sounds valid, since B in your example has the greater proportion. So it is not a p-value but rather the probability that B is greater than A (you do not expect it to be < 0.05).
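
As a quick cross-check (a minimal Monte Carlo sketch, not the closed-form result itself), you can also sample the two Beta posteriors directly and compare the draws; the share of draws where B beats A approximates the same probability:

# draw from each posterior and count how often B's rate exceeds A's
pA_draws <- rbeta(1e6, a_A, b_A)
pB_draws <- rbeta(1e6, a_B, b_B)
mean(pB_draws > pA_draws)
# should come out close to 0.958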

You can also run simple simulations to check the result:

set.seed(123)
n <- 1000

# do Binomial distributions with proportions
# from your data give similar estimates?

mean(rbinom(n, 1000, a_B/1000) > rbinom(n, 1000, a_A/1000))

# and do values simulated in a fashion similar to
# the model yield similar results?

fun2 <- function(n=1000) {
  pA <- rbeta(1, a_A, b_A)
  pB <- rbeta(1, a_B, b_B)
  mean(rbinom(n, 1000, pB) > rbinom(n, 1000, pA))
}

summary(replicate(1000, fun2(1000)))

In both cases the answer is yes.


As for the code, notice that the for loop is unnecessary; such loops generally make things slower in R, so you can alternatively use vapply for cleaner and slightly faster code:

fun <- function(i) exp(lbeta(a_A+i, b_B+b_A)
                       - log(b_B+i)
                       - lbeta(1+i, b_B)
                       - lbeta(a_A, b_A))

sum(vapply(0:(a_B-1), fun, numeric(1)))
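
Since `lbeta` and `log` are vectorized over their arguments, the same sum can also be computed without any loop or apply call at all (a minimal sketch of the same calculation):

i <- 0:(a_B-1)
sum(exp(lbeta(a_A+i, b_B+b_A) - log(b_B+i) - lbeta(1+i, b_B) - lbeta(a_A, b_A)))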
Tim
  • Hmm... I wonder if you actually tested for speed, because `vapply` is no more vectorized than the `for` loop; on the contrary, they are basically the same. Nice answer though. – David Arenburg Mar 10 '15 at 23:09
  • @DavidArenburg `vapply` is approximately 1.1 times faster. It is not much, but generally, if you can avoid for loops in R, in most cases it is better to do so. – Tim Mar 11 '15 at 06:04
  • @DavidArenburg *Any* vectorized function internally is a kind of loop... But OK, I changed the wording so it is not confusing. – Tim Mar 11 '15 at 06:36
  • C/C++/Fortran `for` loops == vectorized; R `for` loop != vectorized. This is basically the definition of vectorized. – David Arenburg Mar 11 '15 at 06:40
  • Thanks for the answer and this great vectorization discussion as well. @Tim, I did look at the log comment, but I thought R could handle it; guess not. I'll use your code; for my needs, speed isn't important. – Yehoshaphat Schellekens Mar 11 '15 at 10:29
  • @YehoshaphatSchellekens the point about logs is not specific to any particular software but applies to statistical computing in general. In the example on the site you quote, Julia code is provided - Julia is also a very good language for statistical programming, and still logs are used there. – Tim Mar 11 '15 at 10:34
  • Actually, I've just [asked a question regarding this exact discussion we had](http://stackoverflow.com/questions/28983292/is-the-apply-family-really-not-vectorized) on SO; I may need to rethink my approach towards `vapply` in the future. I hope I'll get a nice answer once and for all. – David Arenburg Mar 11 '15 at 10:35
  • OK, after long thinking and summing up the comments and the answers I received on my question, I think I came up with a general understanding of what `vapply` really is. See my answer [here](http://stackoverflow.com/questions/28983292/is-the-apply-family-really-not-vectorized/29006276#29006276). – David Arenburg Mar 12 '15 at 09:47