H2O: Can I use H2O for time series predictions?

Question

I understand that H2O has no model specifically for time series. Is there a workaround that would let me use Deep Learning and/or GBM? Is some kind of data transformation necessary? Are there any examples?

Are there any plans for ARIMA or LSTM?

score 3 · Accepted Answer · answered Jul 11 '17 at 15:18

You can use H2O for time series, and you would normally do some data engineering to create time-based features. In my book (Practical Machine Learning with H2O) one of the three main data sets is prediction of football match results, so that shows some of the techniques.
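As a minimal sketch of what "time-based features" can mean (this is not taken from the book; the function name, column names and the toy series are all made up for illustration), the snippet below derives a previous value, a trailing moving average and a first difference for each time step, using only observations that come before that step. The resulting rows would then be loaded into H2O as ordinary columns.

```python
from statistics import mean


def add_trailing_features(values, window=3):
    """Return one feature dict per time step, built only from earlier observations."""
    rows = []
    for t, y in enumerate(values):
        past = values[max(0, t - window):t]  # strictly before t, so no look-ahead
        rows.append({
            "y": y,  # the value to be predicted at time t
            "prev": values[t - 1] if t >= 1 else None,
            "moving_avg": mean(past) if len(past) == window else None,
            "diff_prev": values[t - 1] - values[t - 2] if t >= 2 else None,
        })
    return rows


# Hypothetical toy series, purely for illustration.
series = [10.0, 12.0, 11.0, 13.0, 15.0, 14.0]
rows = add_trailing_features(series, window=3)
for row in rows:
    print(row)
```

The first few rows contain None because there is not yet enough history; whether to drop those rows or leave them as missing values is a judgment call for your data.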

I normally do things like arima and adf.test in R, and use the outputs as features I load into H2O, though that is not ideal if your data set won't fit in memory (handling such data is one of the key advantages of H2O over R). There are two feature requests, which you could comment on or vote for: https://0xdata.atlassian.net/browse/PUBDEV-2590 and https://0xdata.atlassian.net/browse/PUBDEV-4153, but it appears no-one is working on them yet.
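To illustrate the idea of using a time-series model's output as a feature, here is a rough sketch in plain Python. It is not R's arima: as a stand-in it fits a much simpler AR(1) model, y[t] = c + phi * y[t-1], by least squares. The function names, the toy series and the choice to fit on the first part of the series only (so that later observations do not leak into the feature) are all assumptions made for this example.

```python
def fit_ar1(train):
    """Least-squares fit of y[t] = c + phi * y[t-1] on the training series."""
    x = train[:-1]  # predecessors
    y = train[1:]   # values to explain
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    phi = sxy / sxx
    c = my - phi * mx
    return c, phi


def one_step_forecasts(series, c, phi):
    """Forecast each y[t] from y[t-1]; the first entry has no predecessor."""
    return [None] + [c + phi * prev for prev in series[:-1]]


# Hypothetical toy series that follows y[t] = 1 + 2 * y[t-1] exactly.
series = [1.0, 3.0, 7.0, 15.0, 31.0, 63.0]
n_train = 4  # fit on the first four points only

c, phi = fit_ar1(series[:n_train])
forecast_col = one_step_forecasts(series, c, phi)  # extra column for the H2O data
print(c, phi, forecast_col)
```

With a real ARIMA fit the same pattern applies: the forecast column sits alongside the other features, and the H2O model is free to combine it with them.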

LSTMs should be available from H2O using Deep Water (i.e. using TensorFlow or MXNet as a back-end). I'm still hunting for a tutorial specifically on this myself.

As a matter of fact, I already have your book. I submitted errata (twice)! Further to this (which was an H2O bug), congratulations on the nice book; I look forward to the sequel. But I skipped the football dataset, because classification was what I needed back then, and since then I totally forgot about it! So I will try adding moving average columns. My data fits in memory, so I can use ARIMA; in fact, I already did. Do you suggest applying ARIMA to my whole dataset and then using the forecast as an extra column in the data I load into H2O? — erculeo, Jul 15 '17 at 20:31

score 2 · Answer 2 · answered Jul 11 '17 at 12:50

Methods designed specifically for time series work better on such data than black-box machine learning algorithms, as shown, for example, in this blog entry. Time-series models take the time dependence of your data into account, while general-purpose methods do not. Of course, you can add columns of lagged values to your data, but you would still be assuming that $Y_{t-4}$ is a distinct variable that need not have anything in common with $Y_{t-3}$ or $Y_{t-5}$. You could devise a more complicated transformation of your data to imitate what time-series models do, but then why reinvent the wheel?
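For concreteness, the lag-column approach mentioned above can be sketched as follows (the function name, column names and toy numbers are invented for this illustration). Each row holds the previous n_lags values followed by the current value as the target; a general-purpose learner then sees lag_1, lag_2 and lag_3 as three separate input columns, with nothing telling it that they are consecutive readings of the same series.

```python
def make_lag_table(values, n_lags):
    """Rows of [lag_1, ..., lag_n, target]; steps without a full history are skipped."""
    table = []
    for t in range(n_lags, len(values)):
        lags = [values[t - k] for k in range(1, n_lags + 1)]  # lag_1 is the most recent
        table.append(lags + [values[t]])
    return table


# Hypothetical toy series, purely for illustration.
series = [5, 6, 8, 7, 9, 11]
header = ["lag_1", "lag_2", "lag_3", "y"]
table = make_lag_table(series, n_lags=3)

print(header)
for row in table:
    print(row)
```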

As for H2O, you should ask the authors. (However, since it is general-purpose machine learning software, I doubt they will be interested in implementing specialized models.)

Thanks, nice blog entry, I didn't know the package. — erculeo, Jul 15 '17 at 20:24
