An e-commerce website is testing two different designs for a checkout page. Customers who visit the checkout page are randomly presented with one of the two designs.
The first metric of interest, sales uplift, can be measured by comparing the proportion of customers that finalized the sale (a binary outcome) for each of the two designs.
It is reasonably straightforward to compare these using a test for two proportions.
A second metric of interest is the dollar conversion. This is the final dollar value of the sale, as a proportion of the initial dollar value of the incoming shopping cart.
For example: A customer comes to the checkout page with \$160 worth of items in the cart (initial value). The customer removes some items from the cart and finalizes the sale for \$40 worth of items (final value). The sales conversion is 100% (we still sold the customer something), but the dollar conversion is only 25%.
How can I properly test the difference in dollar conversion for the two groups against a null hypothesis of no difference?
See below for some R code specifying the problem:
# example data
set.seed(1)
total_customers <- 1000
target_control <- rbinom(total_customers, 1, 0.5)
sale_success <- rbinom(total_customers, 1, 0.1)
initial_value <- rexp(total_customers, rate=0.1)
final_value <- runif(total_customers, 0, 1.1) * initial_value * sale_success
sales_data <- data.frame(target_control, sale_success, initial_value, final_value)
# sales conversion - test for two proportions (two-tailed)
n1 <- sum(target_control)
n2 <- sum(!target_control)
p1 <- sum(sales_data[target_control==1,"sale_success"])/n1
p2 <- sum(sales_data[target_control==0,"sale_success"])/n2
pbar <- (p1*n1+p2*n2)/(n1+n2)
z <- (p1-p2)/sqrt(pbar*(1-pbar)/n1+pbar*(1-pbar)/n2)
pval <- 2*(1-pnorm(abs(z)))
# dollar conversion - ??
p1 <- sum(sales_data[target_control==1,"final_value"])/sum(sales_data[target_control==1,"initial_value"])
p2 <- sum(sales_data[target_control==0,"final_value"])/sum(sales_data[target_control==0,"initial_value"])
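As a sanity check on the hand-rolled two-proportion z-test above, base R's `prop.test` with `correct = FALSE` should give the same two-sided p-value, because its chi-squared statistic is the square of the pooled z statistic. A minimal sketch that regenerates the relevant part of the example data so it runs on its own (the names `x`, `n`, `pval_manual` and `pval_builtin` are just illustrative):

```r
# Regenerate the example data (same seed and draw order as above)
set.seed(1)
total_customers <- 1000
target_control <- rbinom(total_customers, 1, 0.5)
sale_success   <- rbinom(total_customers, 1, 0.1)

# Number of sales and group size for each design
x <- c(sum(sale_success[target_control == 1]),
       sum(sale_success[target_control == 0]))
n <- c(sum(target_control == 1), sum(target_control == 0))

# Pooled two-proportion z-test by hand
p    <- x / n
pbar <- sum(x) / sum(n)
z    <- (p[1] - p[2]) / sqrt(pbar * (1 - pbar) * (1 / n[1] + 1 / n[2]))
pval_manual <- 2 * pnorm(-abs(z))

# Built-in equivalent; correct = FALSE turns off the continuity correction
res <- prop.test(x, n, correct = FALSE)
pval_builtin <- res$p.value
```

If the two p-values agree, the manual calculation can be replaced by the single `prop.test` call.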
Some things to consider:
- Initial value & final value are correlated
- Initial value & final value both follow a long-tailed distribution, e.g. the negative exponential distribution
- Sometimes final value will be greater than initial value, e.g. the customer adds more to the cart before finalizing the sale
- Sale success & initial value are correlated, but I haven't specified this in the example code
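The last point is not reflected in the example data. One rough way to build it in is to let the probability of a sale depend on the initial cart value through a logistic link. The coefficients (`-2` and `-0.02`), the direction of the effect (larger carts convert less often) and the name `sales_data_corr` are purely illustrative assumptions, not estimates from real data:

```r
set.seed(1)
total_customers <- 1000
target_control <- rbinom(total_customers, 1, 0.5)
initial_value  <- rexp(total_customers, rate = 0.1)

# Hypothetical link: sale probability falls as the cart gets larger
p_sale <- plogis(-2 - 0.02 * initial_value)
sale_success <- rbinom(total_customers, 1, p_sale)

# Final value is zero for customers who did not buy, as in the original example
final_value <- runif(total_customers, 0, 1.1) * initial_value * sale_success
sales_data_corr <- data.frame(target_control, sale_success, initial_value, final_value)

# Inspect the induced association between sale success and initial value
cor(sales_data_corr$sale_success, sales_data_corr$initial_value)
```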
Update 1: Brumar has suggested that the customer-level change in behavior, for those customers who do finalize a sale, can be compared using a Wilcoxon rank-sum test:
sales_data$ratios=final_value/initial_value
ratios_A=sales_data$ratios[sale_success==1 & target_control==0]
ratios_B=sales_data$ratios[sale_success==1 & target_control==1]
wilcox.test(ratios_A,ratios_B)
I'm still interested to know whether there is a way to test the difference in the overall dollar conversion, i.e. the sum of final values over the sum of initial values.
Update 2: Solved by Brumar.
# permutation test (two-tailed)
p1 <- sum(sales_data[target_control==1 & sale_success==1,"final_value"])/sum(sales_data[target_control==1 & sale_success==1,"initial_value"])
p2 <- sum(sales_data[target_control==0 & sale_success==1,"final_value"])/sum(sales_data[target_control==0 & sale_success==1,"initial_value"])
yourGap<-p1-p2
# logical masks: design 1 vs design 0, and customers who finalized a sale
L<-sales_data[,"target_control"]==1
LfilterOnlyBuyers<-sales_data[,"sale_success"]==1
nulldist <- vector(mode="numeric", length=10000)
for ( i in 1:10000) {
  # shuffle the design labels across all customers, then keep buyers only
  Lperm <- sample(L)
  LpermInv <- !Lperm & LfilterOnlyBuyers
  Lperm <- Lperm & LfilterOnlyBuyers
  # dollar conversion (sum of final / sum of initial) under the shuffled labels
  p1_perm <- sum(sales_data[Lperm,"final_value"])/sum(sales_data[Lperm,"initial_value"])
  p2_perm <- sum(sales_data[LpermInv,"final_value"])/sum(sales_data[LpermInv,"initial_value"] )
  nulldist[i] = p1_perm-p2_perm
}
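# two-tailed: compare absolute values on both sides, otherwise a negative gap gives a p-value near 1
pvalue=sum(abs(nulldist) > abs(yourGap))/10000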
alpha=0.05
ci_upper <- yourGap + quantile(nulldist, (1-alpha/2))
ci_lower <- yourGap - quantile(nulldist, (1-alpha/2))
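The interval above borrows its width from the permutation null distribution, so it implicitly assumes the sampling variability is the same under the null as under the observed effect. A different technique, the percentile bootstrap, is sometimes used to get an interval for the difference in ratio-of-sums directly; this is a swapped-in alternative, not part of Brumar's solution. A minimal sketch, assuming customers are independent and resampling buyers within each design (the helper `ratio_of_sums` and the 2000 replicates are illustrative choices):

```r
# Regenerate the example data (same seed and draw order as the question)
set.seed(1)
total_customers <- 1000
target_control <- rbinom(total_customers, 1, 0.5)
sale_success   <- rbinom(total_customers, 1, 0.1)
initial_value  <- rexp(total_customers, rate = 0.1)
final_value    <- runif(total_customers, 0, 1.1) * initial_value * sale_success
sales_data <- data.frame(target_control, sale_success, initial_value, final_value)

# Keep buyers only, matching the permutation test above
buyers <- sales_data[sales_data$sale_success == 1, ]
grp1 <- buyers[buyers$target_control == 1, ]
grp0 <- buyers[buyers$target_control == 0, ]

# Hypothetical helper: dollar conversion as sum(final) / sum(initial)
ratio_of_sums <- function(d) sum(d$final_value) / sum(d$initial_value)

observed_gap <- ratio_of_sums(grp1) - ratio_of_sums(grp0)

n_boot <- 2000
boot_gap <- numeric(n_boot)
for (b in seq_len(n_boot)) {
  # Resample customers (rows) with replacement within each design
  s1 <- grp1[sample(nrow(grp1), replace = TRUE), ]
  s0 <- grp0[sample(nrow(grp0), replace = TRUE), ]
  boot_gap[b] <- ratio_of_sums(s1) - ratio_of_sums(s0)
}

# Percentile interval for the difference in dollar conversion
alpha <- 0.05
ci <- quantile(boot_gap, c(alpha / 2, 1 - alpha / 2))
```

Resampling whole customers keeps each customer's initial and final value paired, which preserves the correlation between them.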