I have a question concerning the data analysis methods to use for a specific situation. Here is the situation:
I have a dataset of customer purchases from an e-commerce site, with 875 observations in total. Each observation (one per customer) consists of 5 variables. Their measurement scales and possible values are:
- Package Type (Nominal): Type 1, Type 2, Type 3
- Sex (Nominal): Male, Female
- Age Group (Ordinal): NULL, Younger than 20, 20-25, Older than 25
- Location (Nominal): NULL, Region 1, Region 2
- Order Count (Ratio): Integers
NULL represents missing data.
The task is to identify which Package Type is preferred by which type of client, where a client type is defined by Sex, Age Group, Location and Order Count. Put another way: which Package Type is most likely given a set of characteristics consisting of Sex, Age Group, Location and Order Count?
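To make the question concrete, here is a minimal sketch of the naive version of what I mean: count how often each Package Type occurs among customers with a given profile. The rows are made up purely for illustration (they are not from my dataset), and the function name `package_shares` is just something I invented for this post. It also ignores Order Count entirely.

```python
from collections import Counter

# Made-up rows for illustration only, NOT real data:
# (package_type, sex, age_group, location, order_count).
# None stands for NULL (missing data).
ROWS = [
    ("Type 1", "Male", "20-25", "Region 1", 3),
    ("Type 1", "Male", "20-25", "Region 1", 1),
    ("Type 2", "Male", "20-25", "Region 1", 7),
    ("Type 3", "Female", None, "Region 2", 2),
    ("Type 3", "Female", "Older than 25", None, 5),
]


def package_shares(rows, sex, age_group, location):
    """Empirical share of each Package Type among customers matching the profile.

    Hypothetical helper: Order Count is deliberately ignored here.
    """
    counts = Counter(
        pkg
        for pkg, s, a, loc, _ in rows
        if (s, a, loc) == (sex, age_group, location)
    )
    total = sum(counts.values())
    # Return an empty dict when no customer matches the profile.
    return {pkg: n / total for pkg, n in counts.items()} if total else {}


shares = package_shares(ROWS, "Male", "20-25", "Region 1")
best = max(shares, key=shares.get)  # Package Type with the largest share
print(best, shares)
```

Simple counting like this presumably breaks down once Order Count, the NULL values and rarely observed profiles are taken into account, which is why I am looking for a proper methodology.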
What I am asking for is not a ready-made solution to this problem; that just wouldn't be as interesting :). I would like you to point me towards the methodology that would be used to answer this question. Which branch of statistics handles this kind of problem? Could you perhaps recommend a good classic book or a forum thread covering the subject?