I'm struggling with a customer churn prediction problem. I have monthly snapshot data going back several years, with a label for whether each customer left during a given month.
My main question is whether I should use the entire dataset as my training set. For example, I could take the March 2014 end-of-month (EOM) snapshot and train on whether each customer left or not. That snapshot includes the March EOM data for all current customers (and those that left in March), plus the time-shifted data for any customers that left prior to March 2014. My thinking is that I can use the entire dataset for training, rather than reserving a test set, because my test set can effectively be the snapshot for April 2014 or May 2014 (or August 2014, for that matter).
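To make the scheme concrete, here is a minimal sketch of the split described above: train on one month's snapshot and evaluate on later months. The row layout and the names `snapshot_month`, `customer_id`, `churned`, and `out_of_time_split` are hypothetical, chosen only for illustration, and the toy rows are not real data.

```python
from collections import defaultdict


def out_of_time_split(rows, train_month, test_months):
    """Return the rows for one training month plus a dict of later test months.

    Months are "YYYY-MM" strings, so plain string comparison orders them
    chronologically.
    """
    if any(month <= train_month for month in test_months):
        raise ValueError("test months must come after the training month")

    train = [r for r in rows if r["snapshot_month"] == train_month]

    tests = defaultdict(list)
    for r in rows:
        if r["snapshot_month"] in test_months:
            tests[r["snapshot_month"]].append(r)
    return train, dict(tests)


# Toy snapshot rows (hypothetical): one row per customer per month.
rows = [
    {"snapshot_month": "2014-03", "customer_id": 1, "churned": 0},
    {"snapshot_month": "2014-03", "customer_id": 2, "churned": 1},
    {"snapshot_month": "2014-04", "customer_id": 1, "churned": 0},
    {"snapshot_month": "2014-04", "customer_id": 3, "churned": 0},
    {"snapshot_month": "2014-05", "customer_id": 1, "churned": 1},
]

train, tests = out_of_time_split(rows, "2014-03", ["2014-04", "2014-05"])
print(len(train), {month: len(r) for month, r in tests.items()})
```

Keeping each later month as its own test set, rather than pooling them, makes it possible to see whether performance changes as the evaluation month moves further from the training month.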
I want to use the whole snapshot for training because the churn rate is relatively low (0.02% in a given month). I've tried splitting off a test set from the training data, and the model usually performs well on that test set but terribly on the subsequent months. (That's probably my real question, but I figured I'd start by getting the train/test question settled.)
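A rough back-of-the-envelope calculation shows why holding out part of a single snapshot is costly at this churn rate. The customer count and the 20% hold-out fraction below are assumptions made up for illustration; only the 0.02% monthly rate comes from my data.

```python
def expected_churners(n_customers, churn_per_million=200):
    """Expected churners in one monthly snapshot.

    0.02% per month is the same as 200 churners per million customers.
    """
    return n_customers * churn_per_million / 1_000_000


n_customers = 100_000   # hypothetical snapshot size
test_fraction = 0.2     # hypothetical hold-out share

total = expected_churners(n_customers)
in_test = total * test_fraction
in_train = total - in_test

print(f"expected churners in snapshot: {total:.1f}")
print(f"left for training after hold-out: {in_train:.1f}")
print(f"available in the hold-out set: {in_test:.1f}")
```

Under these assumed numbers, a snapshot holds only a couple of dozen positive examples, so any hold-out leaves very few churners on either side of the split.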