I have a dataset of continuous features and 4 classes. The class counts are 1793, 246, 103, and 102, and collecting more data is not feasible at this point. I trained a random forest on the full dataset and got F1 scores of 0.97, 0.67, 0.69, and 0.86 for the respective classes. That is acceptable in my case, especially since most misclassifications went to an adjacent class. The train and test sets had similar class proportions.
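In case it helps, here is a minimal sketch of roughly what I did, assuming scikit-learn; the `make_classification` call is just a synthetic stand-in for my real features, with weights roughly matching my class counts:

```python
# Synthetic stand-in for my data (real set: 2244 samples, counts 1793/246/103/102).
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

X, y = make_classification(
    n_samples=2244, n_classes=4, n_informative=6,
    weights=[0.799, 0.110, 0.046, 0.045], random_state=0,
)

# Stratified split, so train and test keep similar class proportions.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

clf = RandomForestClassifier(n_estimators=500, random_state=0).fit(X_train, y_train)

# Per-class F1 scores, one per class (average=None).
print(f1_score(y_test, clf.predict(X_test), average=None))
```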
However, I then tried balancing the class counts by dropping instances of the dominant classes. First I reran the random forest after keeping only every 8th instance of the first class, which gave me an F1 score above 0.9 for every class. After that I undersampled further to fully balance all 4 classes, and got slightly lower scores than in the second attempt. (Both variants are sketched below.)
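The undersampling was along these lines, continuing from the snippet above; the index bookkeeping is my reconstruction rather than exact code:

```python
import numpy as np

rng = np.random.default_rng(0)

# Second try: keep only every 8th instance of the dominant class (class 0 here).
idx_dominant = np.flatnonzero(y == 0)[::8]
idx_rest = np.flatnonzero(y != 0)
keep = np.sort(np.concatenate([idx_dominant, idx_rest]))
X_sub, y_sub = X[keep], y[keep]

# Third try: fully balance by downsampling every class to the minority size.
min_count = np.bincount(y).min()
keep_bal = np.concatenate([
    rng.choice(np.flatnonzero(y == c), size=min_count, replace=False)
    for c in np.unique(y)
])
X_bal, y_bal = X[keep_bal], y[keep_bal]

# I then repeated the stratified split, fit, and per-class F1 on each variant.
```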
Which of the three approaches is the way to go? Should I simply pick the one with the best scores, or is there something I'm missing?