I've built many models on customer data over time, across many different types of datasets, and it puzzles me that Logistic Regression consistently shows small performance deviations across months, in both the training and test periods. Random Forests, on the other hand, tend to show a large performance gap between the months used for training and the months used for testing. Test performance is similar between the two approaches.
The training periods are highly correlated, since the same customers appear each month, but why such a sharp drop when moving to the test set, and only for RF? I'm hesitant to call it overfitting, since the same thing happens no matter how much I tune the hyperparameters, and test performance is in any case similar to the Logistic model.
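To make the pattern concrete, here is a minimal sketch on synthetic data (not my dataset) showing what I mean: an unconstrained Random Forest scores near-perfectly on its training data while Logistic Regression does not, yet the two end up with comparable test scores.

```python
# Sketch on synthetic data: RF nearly memorizes the training set,
# LR does not, but held-out performance is comparable.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=5000, n_features=20,
                           n_informative=5, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

scores = {}
for model in (LogisticRegression(max_iter=1000),
              RandomForestClassifier(n_estimators=200, random_state=0)):
    model.fit(X_tr, y_tr)
    name = type(model).__name__
    # (train accuracy, test accuracy)
    scores[name] = (model.score(X_tr, y_tr), model.score(X_te, y_te))
    print(name, scores[name])
```

With default (unlimited) tree depth, the RF's train accuracy sits near 1.0 while its test accuracy is far lower; the logistic model's train and test accuracies stay close together.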
Is there any theoretical reason for this?