Background
I have some data that looks like this:
time  apples  oranges
1     5       2
2     5       2
3     6       2
4     6       3
5     7       3
I want to create a sliding window time series from this.
I have converted this into supervised time series data for deep learning by following this guide. Suppose I set the sliding window width to 3 time steps, so each sample covers (t-2), (t-1), and (t).
So now my data looks like this:
time  apples(t-2)  oranges(t-2)  apples(t-1)  oranges(t-1)  apples(t)  oranges(t)
3     5            2             5            2             6          2
4     5            2             6            2             6          3
5     6            2             6            3             7          3
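For reference, the guide's shift-based transform looks roughly like this on the toy data above; this is my own reconstruction, so the variable names are mine:

import pandas as pd

# toy data from the first table
df = pd.DataFrame({"apples": [5, 5, 6, 6, 7],
                   "oranges": [2, 2, 2, 3, 3]},
                  index=pd.Index([1, 2, 3, 4, 5], name="time"))

# shift each column back by 2 and 1 steps, then keep only complete windows
lagged = [df.shift(lag).add_suffix(f"(t-{lag})") for lag in (2, 1)]
windowed = pd.concat(lagged + [df.add_suffix("(t)")], axis=1).dropna()

windowed reproduces the table above row for row; the trouble is purely that this concatenation explodes at scale.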
This format works very well for the problem described above where we only have a few columns and a small sliding window.
Suppose we scale this problem to the following:
time     var-1  var-2  ...  var-80
1        5      2      ...  6
2        5      2      ...  6
:        :      :      ...  :
400,000  6      1      ...  2
And now we say we want a sliding window of 10,000 time steps. Using the same form as before, the output shape will be (390,000, 800,000). This won't work: a table of that size runs to terabytes, and read/write times are far too slow.
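To put a number on it: 390,000 rows × 800,000 columns × 4 bytes (assuming float32) is about 1.25 TB for the materialized table, before any duplication from batching.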
Question
I'm looking for a different way to structure my data that does not explode its size while still allowing it to be fed into an LSTM neural network.
To provide further context:
train_X = train[:, :]
# reshape input to be 3D [samples, timesteps, features]
train_X = train_X.reshape((train_X.shape[0], 1, train_X.shape[1]))
...
model.fit(train_X, train_y, ...)
train is a table created using the transform described above: every row is a sample containing all the time steps in the sliding window for every var. train_X is the same as train in this case because our class labels are stored elsewhere. train_X is reshaped into a 3D numpy array to fit the specification of keras.models.Sequential.fit.
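(With the toy table above, dropping the time column, train_X would have shape (3, 6), and the reshape turns it into (3, 1, 6): a single timestep whose features are the flattened window.)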
Except I don't want to load all of train_X at once, because I don't have enough memory to build a table that holds all of the data for all of my experiments.
I'm looking for a way to train my time series model without needing to aggregate the entire training dataset at once. I want to aggregate only some reasonable number of rows/samples at a time, as needed, on the fly.
Ideas
I am a newbie at deep learning frameworks, so please be brutal and specific. It seems like there should be a way to pass indices to a deep learning framework. The guide I linked above uses pandas.shift() to restructure the data. I ended up writing my own implementation because the one from that link was too slow; I kept the link because the output of my program matches the output of the linked program.
My transform script relies on this line of code:
# sliding_window here is the largest lag, so each window spans sliding_window + 1 rows
for idx in range(0, numRows - sliding_window):
    newData[idx, :] = origionalData[idx:idx + sliding_window + 1, :].ravel()
This builds the new data table row by row. A row in newData is selected from origionalData all at once, and each row in newData would be one input to my LSTM. Each row by itself is manageable in time and space complexity; the space problem comes from putting all of these rows together into one table. (A zero-copy alternative is sketched below.)
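For what it's worth, one alternative I've been eyeing is to build the same windows as a zero-copy view instead of copying rows. This assumes NumPy ≥ 1.20, where sliding_window_view exists, and reuses the names from my snippet:

import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

# origionalData has shape (numRows, nVars); each window spans
# sliding_window + 1 rows, matching the loop above
windows = sliding_window_view(origionalData, sliding_window + 1, axis=0)
# windows has shape (numRows - sliding_window, nVars, sliding_window + 1)
# and is a view into origionalData -- no data is copied
windows = windows.transpose(0, 2, 1)  # -> (samples, timesteps, features)

Slicing a batch out of windows copies only that batch, so the full table never has to exist in memory.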
Why can't I just pass indices into my deep learning framework? (Ha, what an idiot, you can...)
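Something like the following is what I am picturing: a minimal sketch using keras.utils.Sequence, which lets fit pull batches by index on the fly. The class name WindowGenerator and its arguments are my own invention, not from any guide:

import numpy as np
from tensorflow import keras

class WindowGenerator(keras.utils.Sequence):
    """Build (batch, timesteps, features) windows on the fly from the raw array."""

    def __init__(self, data, labels, sliding_window, batch_size):
        self.data = data              # shape (numRows, nVars), never windowed up front
        self.labels = labels
        self.sliding_window = sliding_window
        self.batch_size = batch_size
        # one sample per valid start index, same count as the loop above
        self.starts = np.arange(len(data) - sliding_window)

    def __len__(self):
        return int(np.ceil(len(self.starts) / self.batch_size))

    def __getitem__(self, i):
        idx = self.starts[i * self.batch_size:(i + 1) * self.batch_size]
        X = np.stack([self.data[s:s + self.sliding_window + 1] for s in idx])
        y = self.labels[idx + self.sliding_window]  # label aligned with time t
        return X, y

# model is the compiled Sequential model from above
model.fit(WindowGenerator(data, labels, sliding_window, batch_size=32), epochs=10)

Each batch then materializes only batch_size × (sliding_window + 1) × nVars values (roughly 100 MB for the 80-var, 10,000-step case at batch size 32); older Keras versions expose the same thing through fit_generator instead of fit.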
As mentioned, I am a newbie at deep learning, and this is my first large-scale deep learning project. I chose Keras because it is simple; I know that other frameworks (e.g. TensorFlow) are more flexible at the expense of being less user-friendly.
I have researched this problem broadly but haven't been able to gain any traction. I was hoping that somebody with a better understanding of deep learning frameworks might be able to point me in a helpful direction.
Thanks!