My goal is to predict `y`, but my dependent variable `y` has more than 20 levels. I don't think a multinomial model would be a good choice? Any suggestions or pointers on which modeling methodology I should explore for this problem would be much appreciated. Thanks in advance.

bison2178
  • [Ordinal logistic](https://en.wikipedia.org/wiki/Ordered_logit) models are often used when $y$ is ordered - proportional odds being the most common assumption. But why don't you think a multinomial model would be a good choice? – Scortchi - Reinstate Monica Jun 03 '16 at 18:13
  • @Scortchi, the data is not ordinal. It just seems odd running a multinomial model on a `y` that has so many levels. This may be due to my inexperience on this issue. – bison2178 Jun 03 '16 at 18:51
  • Just wondered. I can't think of anything specifically contra-indicating multinomial regression in this case, though of course you'll have a lot of coefficients to estimate & over-fitting'll become a problem sooner than with fewer levels. If you're classifying rather than just predicting the probability of class membership, then think about @hxd1011's point. Is each of the more than 380 possible misclassifications equally bad? – Scortchi - Reinstate Monica Jun 05 '16 at 09:04
  • @Scortchi Thanks Scortchi now I am confident about my instincts on this topic – bison2178 Jun 09 '16 at 04:29

1 Answer

Predicting a discrete outcome with many levels is a hard problem. A common approach is one-vs-others (also called one-vs-rest): you build one model per level, and each model detects only whether the output is that specific level.
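
As a rough illustration of the one-vs-others idea, here is a minimal sketch in plain Python. Everything in it is an assumption made for the example, not something from the question: the toy data (three well-separated clusters labelled `"a"`, `"b"`, `"c"`), the helper names (`fit_binary_logistic`, `fit_one_vs_rest`, `predict_one_vs_rest`), and the choice of logistic regression fitted by gradient descent as the per-level model. In practice you would likely use a library implementation instead.

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def linear_score(w, b, x):
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def fit_binary_logistic(X, t, lr=0.1, epochs=500):
    """Fit a binary logistic regression by batch gradient descent (t is 0/1)."""
    n, n_features = len(X), len(X[0])
    w, b = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for x, target in zip(X, t):
            err = sigmoid(linear_score(w, b, x)) - target
            for j in range(n_features):
                grad_w[j] += err * x[j]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def fit_one_vs_rest(X, y):
    """Train one binary model per level: 'this level' versus 'all others'."""
    models = {}
    for level in sorted(set(y)):
        t = [1 if label == level else 0 for label in y]
        models[level] = fit_binary_logistic(X, t)
    return models

def predict_one_vs_rest(models, x):
    """Pick the level whose own model gives the highest score."""
    scores = {lvl: sigmoid(linear_score(w, b, x)) for lvl, (w, b) in models.items()}
    return max(scores, key=scores.get)

# Hypothetical toy data: three clusters in two dimensions.
random.seed(0)
centers = {"a": (0.0, 0.0), "b": (4.0, 0.0), "c": (0.0, 4.0)}
X, y = [], []
for level, (cx, cy) in centers.items():
    for _ in range(30):
        X.append([cx + random.gauss(0, 0.5), cy + random.gauss(0, 0.5)])
        y.append(level)

models = fit_one_vs_rest(X, y)
accuracy = sum(predict_one_vs_rest(models, x) == lab for x, lab in zip(X, y)) / len(y)
print(f"training accuracy: {accuracy:.2f}")
```

With 20+ levels the structure is the same, just with 20+ binary models. Note that the scores from separate models are not calibrated against each other, so the final "pick the highest score" step is a heuristic.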

Here is why: suppose you have a 100-sided die and you know its true distribution, where $P(S_1)=0.1$ and $P(S_2)=P(S_3)=\cdots=P(S_{100})=0.9/99\approx 0.00909$. What happens if you use maximum a posteriori (MAP) estimation? You will always guess the first side, $S_1$, since it has the largest probability compared to the others. However, you will be wrong $90\%$ of the time!
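
The die example can be checked numerically. The following is a small sketch, assuming the distribution stated above and a simulated 100,000 rolls (the sample size and seed are arbitrary choices for illustration):

```python
import random

n_sides = 100
# True distribution: side S_1 has probability 0.1, the other 99 sides share 0.9.
probs = [0.1] + [0.9 / 99] * 99
assert abs(sum(probs) - 1.0) < 1e-9

# The MAP guess is the single most probable side (index 0 corresponds to S_1).
map_guess = max(range(n_sides), key=lambda i: probs[i])

# Always guessing that side is right with probability P(S_1).
expected_accuracy = probs[map_guess]

# Simulate rolls and measure how often the constant MAP guess is correct.
random.seed(0)
rolls = random.choices(range(n_sides), weights=probs, k=100_000)
empirical_accuracy = sum(r == map_guess for r in rolls) / len(rolls)

print(f"MAP guess: S_{map_guess + 1}")
print(f"expected accuracy:  {expected_accuracy:.3f}")
print(f"empirical accuracy: {empirical_accuracy:.3f}")
```

Even with the true distribution known exactly, the best single guess is right only about one time in ten, which is why accuracy alone can be a discouraging metric when there are many levels.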

For details, please check my answers in this post

Haitao Du