I am training a random forest classifier on a dataset with 5000 samples and get much better scores on the training set than on the test set, although the test-set performance is still satisfactory. An extensive hyperparameter search and recursive feature elimination both indicate that it is impossible to reduce the train-test gap without drastically hurting test performance.
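To make the setup concrete, here is a minimal sketch with scikit-learn (synthetic data standing in for my real dataset; the actual features and hyperparameters differ):

```python
# Sketch of the setup: fit a random forest and compare train vs. test accuracy.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=5000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

rf = RandomForestClassifier(n_estimators=500, random_state=0)
rf.fit(X_train, y_train)

# Training accuracy is typically near 1.0; test accuracy is noticeably lower
# but still reasonable -- this is the gap I am asking about.
print("train accuracy:", rf.score(X_train, y_train))
print("test accuracy: ", rf.score(X_test, y_test))
```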
So by conventional measures (train vs. test performance), the random forest is overfitting horribly. But everyone seems to think that random forests don't overfit. I am aware that some people say training performance is not a meaningful metric (see the answer to Random forest is overfitting?). But that argument should apply equally well to any machine learning model, and for most of them a large train-test gap is a good indication of overfitting.
Is there a conceptual explanation for how a random forest can achieve unrealistically good performance on the training set without harming its predictions on the test set?