I have data on a series of winning and losing bets over 5 rounds of betting with attrition after each round. I am using a decision tree like the following to display the data.
The nodes towards the top of the tree are customers on runs of winning bets, and those towards the bottom are customers on runs of losing bets. I want to look at (a) attrition at each node and (b) changes in mean bet size at each node. For attrition, I'm looking at the rate of attrition at each node relative to the previous node, and at the survival rate relative to the expected number of people at each node if the probability of winning is 50%. For example, if the probability is 50% at each node, then out of the 1000 that started, roughly 500 people should be in each of the second nodes, W and L. The hypotheses are that (a) the rate of attrition is higher after losing bets and (b) mean bet size is reduced after losers and raised after winners.
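To make the expected-count baseline concrete, here is a minimal sketch of the calculation I have in mind. The function names are my own placeholders, and it assumes every bet is won with the same fixed probability and that nobody drops out, which is exactly the benchmark I want to compare the observed counts against.

```python
from itertools import product


def expected_counts(n_start, p_win=0.5, max_round=3):
    """Expected number of customers at each node (e.g. 'W', 'WL', 'WWW')
    if nobody dropped out and every bet is won with probability p_win."""
    counts = {}
    for depth in range(1, max_round + 1):
        for path in product("WL", repeat=depth):
            wins = path.count("W")
            prob = p_win ** wins * (1 - p_win) ** (depth - wins)
            counts["".join(path)] = n_start * prob
    return counts


def attrition_rate(n_at_node, n_betting_again):
    """Share of customers at a node who do not place another bet."""
    return 1 - n_betting_again / n_at_node


def survival_vs_expected(n_observed, n_expected):
    """Observed count at a node as a fraction of the no-attrition expectation."""
    return n_observed / n_expected
```

With 1000 starters and a 50% win probability this gives 500 for W and L, 250 for each two-bet node, and 125 for each three-bet node. The observed count at a node divided by these figures is the survival rate I described.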
I just want to do this in a very simple univariate setting first. How can I perform a t-test to show that the change in mean bet size from node WW to node WWW is statistically significant if 50 people have dropped out? I'm not sure this is the right approach: each subsequent bet is independent, but people are dropping out after losers, so the samples are not matched. If it were just a case of the same class taking a series of exams one after the other with no one dropping out, I'd understand how to perform the appropriate t-test, but I think this is a bit different.
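For concreteness, these are the two calculations I am weighing, sketched with the standard library only. The function names are hypothetical and I am not claiming either one is correct here. The first is a Welch (unequal-variance) t-statistic computed from the per-node summary statistics. It treats WW and WWW as independent samples, which they are not, since everyone at WWW was also at WW. The second is a paired t-statistic restricted to customers who have a bet recorded at both nodes. It ignores the people who dropped out, so it is conditional on survival.

```python
import math
from statistics import mean, stdev


def welch_t_from_summary(mean1, sd1, n1, mean2, sd2, n2):
    """Welch t-statistic and Welch-Satterthwaite degrees of freedom
    from the mean, standard deviation and count of two groups."""
    v1 = sd1 ** 2 / n1
    v2 = sd2 ** 2 / n2
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


def paired_t(before, after):
    """Paired t-statistic and degrees of freedom for customers observed
    at both nodes; the two lists must be in the same customer order."""
    if len(before) != len(after):
        raise ValueError("before and after must have the same length")
    diffs = [a - b for b, a in zip(before, after)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t, n - 1
```

In both cases the two-sided p-value would come from a t distribution with the returned degrees of freedom, for instance via `scipy.stats.t.sf` if SciPy is available.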
How can I do this? Also, if the results are being skewed by a small number of customers, how could I take out the top 5% and bottom 5%? Should I just remove the customers with the highest and lowest cumulative stake sizes over bets 1 to 3?
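The trimming I am imagining would look roughly like the sketch below. It is only one possible rule: it ranks customers by their cumulative stake over the bets supplied and drops a fixed share from each end. The function name and the dictionary layout (customer id mapped to a list of stakes) are assumptions for illustration, not how my data is actually stored.

```python
def trim_by_cumulative_stake(stakes_by_customer, lower=0.05, upper=0.05):
    """Drop the customers with the lowest and highest cumulative stakes.

    stakes_by_customer maps a customer id to that customer's stakes
    (e.g. bets 1 to 3). Returns a dict containing only the kept customers.
    """
    totals = {cid: sum(stakes) for cid, stakes in stakes_by_customer.items()}
    ranked = sorted(totals, key=totals.get)  # ascending cumulative stake
    n = len(ranked)
    k_low = int(n * lower)    # number of customers cut from the bottom
    k_high = int(n * upper)   # number of customers cut from the top
    kept = ranked[k_low:n - k_high]
    return {cid: stakes_by_customer[cid] for cid in kept}
```

With 100 customers and the default 5% cut-offs this removes the five smallest and five largest cumulative stakers. The node means would then be recomputed on the remaining customers.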
I have the data from which the figure was generated, so I have the mean, standard deviation, standard error, etc. at each node.