Random classification forests for extremely sparse response variables

Asked Feb 06 '14 at 18:39

Active Feb 06 '14 at 18:39

Viewed 163 times

I have a response variable that can take the values $A$, $B$ or $C$. It is very sparse (heavily imbalanced): 99% of the sample is $B$, and the remaining 1% is divided approximately evenly between $A$ and $C$.

How do I predict this variable with a random classification forest? I am looking for guidelines:

Can I use the standard classification splitting criterion with such a sparse response variable?
Given the asymmetric cost of an out-of-sample misclassification (classifying $A$ or $C$ correctly matters most, while classifying $B$ correctly is a lower priority), how do I apply an asymmetric loss function here?
Are there other special things I need to take into consideration when modelling such a sparse response variable?
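
To make the second point concrete, one way an asymmetric loss could enter the splitting criterion is through class weights in the Gini impurity. This is only a minimal sketch of what I have in mind, not something taken from a particular library: the function name `weighted_gini` and the weights (50 for $A$ and $C$, 1 for $B$) are hypothetical and chosen purely for illustration.

```python
from collections import Counter

def weighted_gini(labels, class_weight):
    """Gini impurity where each observation counts with its class weight.

    `class_weight` is a hypothetical mapping from class label to weight;
    classes missing from the mapping default to a weight of 1.
    """
    counts = Counter(labels)
    weighted = {c: n * class_weight.get(c, 1.0) for c, n in counts.items()}
    total = sum(weighted.values())
    if total == 0:
        return 0.0
    return 1.0 - sum((w / total) ** 2 for w in weighted.values())

# A node mirroring the sample: 99% B, the remainder split between A and C.
node = ["B"] * 990 + ["A"] * 5 + ["C"] * 5

# Standard (equal-weight) criterion versus an assumed asymmetric weighting.
unweighted = weighted_gini(node, {"A": 1.0, "B": 1.0, "C": 1.0})
weighted = weighted_gini(node, {"A": 50.0, "B": 1.0, "C": 50.0})
```

With equal weights the node already looks almost pure (impurity of about 0.02), so a split that isolates $A$ or $C$ barely moves the criterion. With the assumed weights the same node has an impurity of about 0.5, so such splits would be rewarded much more. Some implementations expose this idea directly, for example the `class_weight` parameter of scikit-learn's `RandomForestClassifier`.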

edited Apr 13 '17 at 12:44

Community

asked Feb 06 '14 at 18:39

Jase



0 Answers