
I would like to use machine learning models on multivariate time series data to forecast over long horizons (for example, 400 items with their sales history over the last year plus content features).

From many papers, blogs and Kaggle notebooks I understand that the time series must be made stationary before I use classical ML algorithms. The reason is that tree-based models such as XGBoost or CatBoost can't extrapolate into the future, i.e. beyond the range of target values seen in training.

If I make the mean and variance stationary and add seasonal features (such as lags), then it should be fine :)

To stabilize the variance I can use a log or Box-Cox power transformation.
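To make the round trip concrete, here is a minimal, dependency-free sketch of a Box-Cox transform and its inverse. The function names `boxcox` and `inv_boxcox`, the sample `sales` values and the choice `lam = 0.5` are all illustrative assumptions; in practice the parameter would normally be estimated from the data, and SciPy ships its own implementations.

```python
import math

def boxcox(y, lam):
    """Box-Cox transform of strictly positive values; lam == 0 is the natural log."""
    if any(v <= 0 for v in y):
        raise ValueError("Box-Cox requires strictly positive values")
    if lam == 0:
        return [math.log(v) for v in y]
    return [(v ** lam - 1.0) / lam for v in y]

def inv_boxcox(z, lam):
    """Map transformed values (or predictions made on that scale) back."""
    if lam == 0:
        return [math.exp(v) for v in z]
    return [(lam * v + 1.0) ** (1.0 / lam) for v in z]

sales = [12.0, 15.0, 30.0, 55.0, 120.0]  # made-up sales figures
lam = 0.5                                # assumed value, not estimated here

transformed = boxcox(sales, lam)         # the model would be trained on this scale
restored = inv_boxcox(transformed, lam)  # same call is used on the predictions
```

The same `inv_boxcox` call would be applied to the model's predictions. With a nonzero `lam`, a prediction for which `lam * v + 1` is negative cannot be mapped back, so predictions may need clipping first.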

For the mean, though, I can't find a practical approach to enforce stationarity. I tried differencing, but since I have a long horizon (for example, 90 future points) I got very bad results.
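For reference, this is a minimal sketch of first differencing and of how predictions on the differenced scale are mapped back to levels. The helper names `difference` and `undifference` and all numbers are made up for illustration; `predicted_diffs` stands in for whatever the model outputs.

```python
def difference(y):
    """First differences: y[t] - y[t-1]."""
    return [y[i] - y[i - 1] for i in range(1, len(y))]

def undifference(last_value, diffs):
    """Rebuild levels by cumulatively summing differences onto a known level."""
    levels = []
    level = last_value
    for d in diffs:
        level += d
        levels.append(level)
    return levels

history = [100.0, 103.0, 101.0, 106.0, 110.0]  # made-up series
diffs = difference(history)

# Sanity check: the differences reproduce the history after its first point.
rebuilt = undifference(history[0], diffs)

# Placeholder for model output on the differenced scale (hypothetical values).
predicted_diffs = [2.0, 2.0, 2.0]
forecast = undifference(history[-1], predicted_diffs)
```

Because each forecast level is the last observed level plus a running sum of predicted differences, an error in any one predicted difference carries into every later step. That accumulation may be one reason a 90-step horizon looks poor after differencing.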

I am familiar with the two types of trend: stochastic and deterministic.

Can someone help with how to make the mean stationary and then transform the predictions back to their original scale? If anyone has a Python code example for such a task, that would be great! :)
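To illustrate the kind of round trip being asked about, here is a minimal sketch for the deterministic-trend case: fit a straight line by ordinary least squares, model the residuals, then add the extrapolated trend back. A linear trend is an assumption made only for this example, as are the helper names `fit_linear_trend` and `trend_values` and the data; `predicted_residuals` is a zero placeholder for the ML model's output.

```python
def fit_linear_trend(y):
    """OLS fit of y = intercept + slope * t, with t = 0, 1, ..., n-1."""
    n = len(y)
    t_mean = (n - 1) / 2.0
    y_mean = sum(y) / n
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    sxy = sum((t - t_mean) * (v - y_mean) for t, v in enumerate(y))
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    return intercept, slope

def trend_values(intercept, slope, start, steps):
    """Evaluate the fitted line at t = start, ..., start + steps - 1."""
    return [intercept + slope * t for t in range(start, start + steps)]

history = [10.0, 12.5, 13.0, 16.5, 18.0, 19.5, 23.0, 24.5]  # made-up series
n = len(history)
intercept, slope = fit_linear_trend(history)

# Detrended series: this is what the ML model would be trained on.
in_sample_trend = trend_values(intercept, slope, 0, n)
residuals = [v - tr for v, tr in zip(history, in_sample_trend)]

horizon = 90
# Placeholder for the model's residual predictions (hypothetical: all zeros).
predicted_residuals = [0.0] * horizon

# Back to the original scale: extrapolated trend plus predicted residuals.
future_trend = trend_values(intercept, slope, n, horizon)
forecast = [r + tr for r, tr in zip(predicted_residuals, future_trend)]
```

If a log or Box-Cox transform is applied first, the inverse steps would run in reverse order: add the trend back, then invert the power transform.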

Richard Hardy
Boris
  • Hi: AFAIK, differencing is the only way to possibly obtain stationarity in the mean. But even that might not work if the trend is not smooth. – mlofton Oct 06 '19 at 13:35
  • It's not the only way. Harmonic terms can be filtered out through spectral analysis. – Michael R. Chernick Oct 06 '19 at 14:02
  • @mlofton, differencing works for stochastic trends. There can be deterministic trends, too, in which case one can model that and subtract the fitted trend from the series to achieve stationarity. – Richard Hardy Oct 06 '19 at 19:07
  • @Richard Hardy: Good point about fitting the trend. Michael Chernick: Good point also, but I don't think that's used to achieve stationarity but rather for de-seasonalizing? I guess if the seasonality is the only non-stationarity, then you are correct. Thanks for both comments. – mlofton Oct 08 '19 at 01:13
