I'm struggling with a statistics problem. Sadly, I'm not a statistician. Maybe someone here who knows more than me (a low bar) can offer some pointers.
I have a population that gets divided, randomly, into two pieces, $X$ and $Y$. If I just wanted to check whether the division is plausibly due to chance, this is easy: I assume a binomial distribution, since I'm just counting who ends up in $X$ or $Y$. I compute $n=|X|+|Y|$ and $\sigma=\sqrt{np(1-p)}$ (assuming $p=0.5$), and then I compare to the normal distribution. So, for example, if I observed $|X|=45$ and $|Y|=55$, I'd get $\sigma=5$, so by chance I'd expect a deviation from the mean $\mu=50$ this small or smaller 68.27% of the time. Alternatively, I'd expect a greater deviation from the mean 31.73% of the time.
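To make the arithmetic explicit, here is a minimal Python sketch of the calculation I just described. The function name `binomial_normal_check` is my own invention, and it assumes the plain normal approximation with no continuity correction, so treat it as an illustration of my reasoning rather than a vetted method.

```python
import math

def binomial_normal_check(x_count, y_count, p=0.5):
    """Normal approximation to a binomial split (hypothetical helper).

    Assumes each member lands in X independently with probability p,
    and applies no continuity correction.
    """
    n = x_count + y_count
    mu = n * p                           # expected size of X
    sigma = math.sqrt(n * p * (1 - p))   # binomial standard deviation
    z = abs(x_count - mu) / sigma        # deviation in units of sigma
    # P(|Z| <= z) for a standard normal, via the error function
    within = math.erf(z / math.sqrt(2))
    return mu, sigma, z, within, 1.0 - within

mu, sigma, z, within, beyond = binomial_normal_check(45, 55)
print(f"mu={mu}, sigma={sigma}, z={z}")
print(f"deviation this small or smaller: {within:.2%}")
print(f"greater deviation: {beyond:.2%}")
```

For the 45/55 split this gives a deviation of exactly one $\sigma$, which is where the 68.27% and 31.73% figures come from.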
But it's not so simple:
I actually want to measure some property of members of $X$ and $Y$. Let's say 25% in $X$ measure positive and 66% in $Y$ measure positive. ($X$ and $Y$ aren't the same cardinality -- the selection process isn't necessarily uniform.) I would like to know whether I should expect a difference like this by chance.
To make it slightly concrete without going into too many business specifics, think of a restaurant that is testing its menu design. When people walk in the door, they are invited to look at one of two menus (assigned randomly, $p=0.5$). They can choose to stay or they can go away. Now I measure how many people order the boeuf bourguignon. In other words, I'm testing menu design to see how it influences wanting BB. My question, when 45 people who saw one menu order BB and 55 people who saw the other order it, is how often this happens by chance.
I don't think this is the same as a simple binomial distribution, but I'm not sure.
This problem is important to me, but I actually have one more that is more subtle. I still have a process that divides people randomly into two populations (the two menus). But now, instead of just measuring consumption of boeuf bourguignon, I also measure how many people order the dauphinois potatoes and the chocolate mousse. Let's call those numbers $A$, $B$, and $C$ in each group. For each group I compute the statistic $t = (A-C)/(A+B+C)$. And I have the same question: what are the chances that the difference in this $t$ statistic between the two groups is due to chance?
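To pin down exactly what I'm computing, here is a small Python sketch of the statistic for each menu group. The helper name `t_statistic` and the counts are made up purely for illustration (they are not my real data), and the sketch only computes the observed difference; it says nothing yet about how likely that difference is by chance, which is the part I'm asking about.

```python
def t_statistic(a, b, c):
    """t = (A - C) / (A + B + C) for one group (hypothetical helper).

    a, b, c: counts of boeuf bourguignon, dauphinois potatoes,
    and chocolate mousse orders in that group.
    """
    total = a + b + c
    if total == 0:
        raise ValueError("group has no orders, t is undefined")
    return (a - c) / total

# Hypothetical counts, invented for illustration only
t_menu_1 = t_statistic(30, 20, 10)
t_menu_2 = t_statistic(25, 20, 15)
observed_difference = t_menu_1 - t_menu_2

print(f"t (menu 1) = {t_menu_1:.4f}")
print(f"t (menu 2) = {t_menu_2:.4f}")
print(f"difference = {observed_difference:.4f}")
```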