Say I have binary variables:
- tall
- basketball player
- football player
- vegan
- programmer
- student
We are also given subsets of a population.
My objective is to find out which combination of variables best describes how a subset differs significantly from the rest of the population.
For example, say that in the entire population of 1000 people there are only 5 people who are vegan, NOT tall, and basketball players, and all 5 of them belong to subset A, which contains 20 people. Intuitively, I know that {vegan, NOT tall, basketball player} is an interesting combination of variables that distinguishes subset A from the rest of the population.
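To make the example concrete, here is a rough sketch of the brute-force version of what I have in mind. It assumes each person is a dict mapping a variable name to True/False and that the subset is contained in the population. The key names, the function names, the `max_size` cutoff, and the score (match rate in the subset minus match rate in the rest) are placeholders I made up for illustration. The score is not a significance test, which is exactly the part I am asking about.

```python
from itertools import combinations, product

# Hypothetical key names for the six binary variables listed above.
VARIABLES = ["tall", "basketball_player", "football_player",
             "vegan", "programmer", "student"]


def count_matches(people, pattern):
    """Count people who agree with every (variable, value) pair in pattern."""
    return sum(
        all(person[var] == val for var, val in pattern.items())
        for person in people
    )


def enumerate_patterns(variables, max_size=3):
    """Yield every assignment of True/False to up to max_size variables."""
    for k in range(1, max_size + 1):
        for chosen in combinations(variables, k):
            for values in product([True, False], repeat=k):
                yield dict(zip(chosen, values))


def rank_patterns(population, subset, variables=VARIABLES, max_size=3):
    """Rank patterns by (match rate in subset) - (match rate in the rest).

    Assumes subset is part of population and is smaller than it.
    Returns (score, pattern, count_in_subset, count_in_population) tuples,
    highest score first.
    """
    n_rest = len(population) - len(subset)
    rows = []
    for pattern in enumerate_patterns(variables, max_size):
        in_subset = count_matches(subset, pattern)
        if in_subset == 0:
            continue  # pattern never occurs in the subset
        in_population = count_matches(population, pattern)
        subset_rate = in_subset / len(subset)
        rest_rate = (in_population - in_subset) / n_rest
        rows.append((subset_rate - rest_rate, pattern, in_subset, in_population))
    rows.sort(key=lambda row: row[0], reverse=True)
    return rows
```

With the numbers above, the pattern {vegan, NOT tall, basketball player} matches 5 of 20 people in subset A and 0 of the remaining 980, so it gets a score of 0.25. The number of patterns grows quickly with the number of variables, and this raw difference ignores sample size and multiple comparisons.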
What types of statistical analysis should I look into to do this in a systematic way?