Test if people drop out or decrease bets after repeated losses

Question

I have data on a series of winning and losing bets over 5 rounds of betting with attrition after each round. I am using a decision tree like the following to display the data.

[Figure: decision tree of the betting outcomes, showing the number of players (N) at each node]

The nodes towards the top of the tree are players on runs of winning bets, and those towards the bottom are players on runs of losing bets. I want to look at (a) attrition at each node and (b) changes in mean bet size at each node. I'm looking at the rate of attrition at each node relative to the previous node, and at the survival rate (using the expected number of people at each node if the probability of winning is 50%). For example, if the probability is 50% at each node, then of the 1000 who started, roughly 500 people should be in each of the second nodes, W and L. The hypotheses are that (a) the rate of attrition is higher after losing bets and (b) mean bet size is reduced after losers and raised after winners.

I just want to do this in a very simple univariate setting first. How can I perform a t-test to show that the change in mean bet size from node WW to node WWW is statistically significant if 50 people have dropped out? I'm not sure this is the right approach: each subsequent bet is independent, but people are dropping out after losers, so the sample is not matched. If it were just a case of the same class taking a series of exams one after the other with no one dropping out, I'd understand how to perform the appropriate t-test, but I think this is a bit different.

How can I do this? Also, if the results are being skewed by a small number of customers, how could I take out the top 5% and bottom 5%? Should I just remove the customers with the highest and lowest cumulative stake sizes over bets 1 to 3?

I have the data from which the figure was generated, so I have the mean, std, std error etc at each node.

The line that should be WL is labelled WW. The error propagates down that line. Is it that all you have is this figure or do you have the data from which the figure was generated? — John, Nov 14 '13 at 20:37
I'm trying to figure out if it's possible to tell from this where the attrition occurs. The N is the people who made a bet but not the people who actually got there. For example, 450 go W but then what comes out is 250 and 180. So, 20 are gone but did those ones win or lose? — John, Nov 14 '13 at 21:16
I have the data from which the figure was generated, yes. I've since edited the tree to correct the error you pointed out and changed some of the end node to replicate the kind of attrition in the real dataset. You're right that the attrition isn't clear at the moment. I'll edit the tree again over the next few minutes to show a bit more data. Thanks. — user2146441, Nov 14 '13 at 21:16

score 1 · Answer 1 · answered Nov 19 '13 at 20:59

It almost seems "obvious by looking" that losers were more apt to drop out than winners.

You could try a set of contingency tables to establish whether the above is statistically significant. For instance, of the 450 winners of the first bet, 25 dropped out and 425 stayed and of the 550 losers, 150 dropped out and 400 stayed. Etc.
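As a minimal sketch of the first such table, the following runs a Pearson chi-square test of independence on the counts quoted above (450 winners: 25 dropped, 425 stayed; 550 losers: 150 dropped, 400 stayed). The function name `chi_square_2x2` is only illustrative, no continuity correction is applied, and the p-value uses the fact that the upper tail of a chi-square distribution with 1 degree of freedom equals `erfc(sqrt(chi2 / 2))`.

```python
import math

def chi_square_2x2(table):
    """Pearson chi-square test of independence for a 2x2 table (no continuity correction)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if dropping out were independent of the bet outcome.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    # With 1 degree of freedom, the upper tail is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Rows: won first bet, lost first bet. Columns: dropped out, stayed.
first_round = [[25, 425], [150, 400]]
chi2, p_value = chi_square_2x2(first_round)
print(f"chi2 = {chi2:.2f}, p = {p_value:.3g}")
```

The same function could be applied to the table at each later node, bearing in mind that running several tests calls for some multiple-comparison adjustment.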

score 0 · Answer 2 · answered Mar 16 '15 at 16:03

This response will probably be a bit off topic, but I'll start with what's on topic. If I were asked specifically to determine whether the change in mean bet size from WW to WWW were significant, I would ignore the people who did not reach both of these nodes. If the goal of this analysis is to make predictions about future behaviour, then the mechanics of the trial should emulate the mechanics of that future behaviour, even if the game is not a game of chance. What is the point of measuring how someone's bet would change from WW to WWW if they're not the type of person to go from WW to WWW?
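A minimal sketch of that restriction, assuming bets are stored per player: keep only players who bet at both WW and WWW, then compute a paired t statistic on their within-player differences. The player ids and bet amounts below are made up purely for illustration, and `bets_ww` and `bets_www` are hypothetical names.

```python
import math
import statistics

# Hypothetical bets keyed by player id; a player missing from bets_www
# dropped out after the WW node.
bets_ww = {"p1": 10.0, "p2": 12.0, "p3": 8.0, "p4": 15.0, "p5": 9.0, "p6": 11.0}
bets_www = {"p1": 12.0, "p2": 13.0, "p3": 9.0, "p4": 18.0, "p5": 10.0}

# Keep only players who placed a bet at both nodes.
both = sorted(set(bets_ww) & set(bets_www))
diffs = [bets_www[p] - bets_ww[p] for p in both]

# Paired t statistic: mean difference divided by its standard error.
n = len(diffs)
mean_diff = statistics.mean(diffs)
se_diff = statistics.stdev(diffs) / math.sqrt(n)
t_stat = mean_diff / se_diff
print(f"n = {n}, mean change = {mean_diff:.2f}, t = {t_stat:.2f}, df = {n - 1}")
```

The statistic would be compared against a t distribution with `n - 1` degrees of freedom. Any conclusion then applies only to players who stay after WW, not to everyone who reached WW.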

That being said, in general we obviously don't like to systematically exclude certain populations. If I were given this data I would focus on the more doable types of analysis. Most notably (especially if this is not a game of chance), the players at a given node have a lot in common: they have had the same sequence of (W,L) and have not left. Answering questions along the lines of "What is the effect of losing a given round on bet size and attrition?" is quite doable while controlling for the node-dependent behaviour, in the form of a multi-level model.

A last piece of advice would be to focus on player-level differences from round to round. The mean bet going down by 5 cents after a bet may not be statistically significant, while 90% of the players' bets going down probably will be.
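One way to act on this is a sign test, sketched below under the assumption that each player's change in bet size between two rounds is available. It ignores the size of each change and asks only whether more players decreased than increased their bet than a 50/50 split would explain. The differences are invented for illustration and `sign_test_two_sided` is a hypothetical helper, not a library function.

```python
import math

def sign_test_two_sided(n_down, n_up):
    """Exact two-sided sign test: under the null, decreases and increases are equally likely."""
    n = n_down + n_up
    k = max(n_down, n_up)
    # Probability of a split at least this lopsided in one direction, then doubled.
    tail = sum(math.comb(n, i) for i in range(k, n + 1)) / 2 ** n
    return min(1.0, 2 * tail)

# Hypothetical per-player changes in bet size (next bet minus previous bet).
diffs = [-0.05, -0.10, -0.02, -0.05, 0.03, -0.01, -0.04, -0.06, -0.02, -0.03]

# Players with no change (a difference of exactly 0) are dropped, as is usual for a sign test.
n_down = sum(d < 0 for d in diffs)
n_up = sum(d > 0 for d in diffs)
p_value = sign_test_two_sided(n_down, n_up)
print(f"{n_down} down, {n_up} up, p = {p_value:.4f}")
```

Because only the direction of each change is used, a few very large bets cannot dominate the result, which also speaks to the concern about a small number of customers skewing the means.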
