Assuming online/incremental training is not available for a particular algorithm, and assuming you have a stream of training data whose distribution may or may not change over time (e.g. log data), what are the disadvantages of the following approach to defending against concept drift?
- Decide on a time window (e.g. one week) and collect all the training data for that window.
- Train a model on the data collected in the previous step.
- Store this model in an array.
- When the time window elapses, train a new model on the new window's data.
- Append this model to your "model array".
- Repeat as necessary or as constrained by resources; older models can be deleted.
For inference, gather the predictions from each model in the array and average the results, possibly weighting more recently trained models more heavily.
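For concreteness, here is a minimal sketch of what I mean. It is a toy illustration, not a reference implementation: the class name, the fixed-size deque for the "model array", and the exponential recency weighting (`decay ** age`) are all my own assumptions, and the "models" are stand-in callables rather than real trained estimators.

```python
from collections import deque

class WindowedEnsemble:
    """Hypothetical sketch: keep the most recent `max_models` models and
    predict with a recency-weighted average (exponential decay assumed)."""

    def __init__(self, max_models=4, decay=0.5):
        # deque with maxlen drops the oldest model automatically,
        # matching the "older models can be deleted" step above
        self.models = deque(maxlen=max_models)
        self.decay = decay  # 1.0 would be a plain unweighted average

    def add_model(self, model):
        # called once per elapsed time window with the freshly trained model
        self.models.append(model)

    def predict(self, x):
        # age 0 = newest model (weight 1), age 1 = previous window, etc.
        preds, weights = [], []
        for age, model in enumerate(reversed(self.models)):
            preds.append(model(x))
            weights.append(self.decay ** age)
        return sum(p * w for p, w in zip(preds, weights)) / sum(weights)

# Stand-ins for models trained on successive windows:
ens = WindowedEnsemble(max_models=2, decay=0.5)
ens.add_model(lambda x: 1.0)  # week 1 model
ens.add_model(lambda x: 3.0)  # week 2 model
print(ens.predict(0))  # (3*1 + 1*0.5) / 1.5
ens.add_model(lambda x: 5.0)  # week 3 model; week 1 model is evicted
print(ens.predict(0))  # (5*1 + 3*0.5) / 1.5
```

The decay constant and window length would be extra hyperparameters to tune, which is itself part of what I am asking about.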
This intuitively seems to make sense; however, I haven't found much research supporting this approach.
What would be the disadvantages of this approach?