I have data from 5 classes and would like to build a classifier. However, the number of feature vectors per class differs by orders of magnitude: about 5,000, 200,000, 1,000,000, 10,000,000 and 1,000,000,000 respectively.
The largest class is too big to train on in full, so I will have to downsample it in any case.
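To make the downsampling concrete, here is a minimal sketch of what I have in mind: cap every class at some maximum number of samples and keep the smaller classes whole. The function name `downsample_per_class` and the `cap` parameter are my own placeholders, and the toy labels only imitate the skew at a much smaller scale. For the real data the labels would not fit in a Python list, so this only illustrates the idea.

```python
import random
from collections import defaultdict

def downsample_per_class(labels, cap, seed=0):
    """Return sorted indices keeping at most `cap` randomly chosen samples per class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, label in enumerate(labels):
        by_class[label].append(i)
    kept = []
    for idx in by_class.values():
        # Classes at or below the cap are kept whole; larger ones are sampled.
        if len(idx) > cap:
            idx = rng.sample(idx, cap)
        kept.extend(idx)
    return sorted(kept)

# Toy labels with a similar kind of skew, scaled far down (not my real data).
labels = ["a"] * 5 + ["b"] * 200 + ["c"] * 1000
kept = downsample_per_class(labels, cap=100)
```

What I do not know is how to choose the cap, or whether capping is the right idea at all.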
I am currently using scikit-learn and random forests, although I can switch to another tool if that would be better. If this were a binary classification problem, I could train with balanced classes and compute the ROC curve to choose an operating point with a false positive rate I can tolerate. However, I have no idea what the right thing to do is in the multiclass case.
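For reference, this is roughly the binary workflow I mean, as a sketch on synthetic data. The data from `make_classification`, the forest settings and the tolerance `max_fpr = 0.01` are all made up for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_curve
from sklearn.model_selection import train_test_split

# Synthetic, roughly balanced two-class data standing in for my real data.
X, y = make_classification(n_samples=2000, n_features=20,
                           weights=[0.5, 0.5], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.5, stratify=y, random_state=0)

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)

# Score the held-out set and trace out the ROC curve.
scores = clf.predict_proba(X_test)[:, 1]
fpr, tpr, thresholds = roc_curve(y_test, scores)

max_fpr = 0.01  # hypothetical false positive rate I can tolerate
# fpr and tpr are non-decreasing, so the last point within the tolerance
# is the one with the highest true positive rate.
best = np.flatnonzero(fpr <= max_fpr)[-1]
threshold = thresholds[best]

# Predict positive whenever the score reaches the chosen threshold.
pred = scores >= threshold
```

With five classes there is no single threshold to tune like this, which is where I get stuck.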
Are there best-practice recommendations for this situation? For example, I don't want the classifier to simply ignore one of the classes.