
I'm using Keras to build and train a recurrent neural network.

from keras.models import Sequential
from keras.layers.core import Dense, Activation, Masking
from keras.layers.recurrent import LSTM
from keras.layers.normalization import BatchNormalization

# build and train the model
in_dimension = 3
hidden_neurons = 300
out_dimension = 2

model = Sequential()
model.add(BatchNormalization(input_shape=(max_sequence_length, in_dimension)))
model.add(Masking([0,0,0], input_shape=(max_sequence_length, in_dimension)))
model.add(LSTM(hidden_neurons, activation='softmax', return_sequences=True, input_shape=(max_sequence_length, in_dimension)))
model.add(LSTM(hidden_neurons, activation='softmax', return_sequences=False))
model.add(Dense(out_dimension, activation='linear'))

model.compile(loss="categorical_crossentropy", optimizer="rmsprop")
model.fit(padded_training_seqs, training_final_steps, nb_epoch=5, batch_size=1)

padded_training_seqs is an array of sequences of [latitude, longitude, temperature], all padded to the same length with [0, 0, 0] entries. When I train this network, the first epoch gives me a loss of about 63, and the loss increases over subsequent epochs. This causes a model.predict call later in the code to return values nowhere near the training values. For example, most of the training values in each sequence are around [40, 40, 20], but the RNN consistently outputs values around [0.4, 0.5], which makes me think something is wrong with the masking layer.
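Since the padding and the Masking layer are the prime suspects, it may help to spell out the rule Keras applies: Masking(mask_value=v) takes a scalar v and masks a timestep only when every feature at that step equals v, so [0, 0, 0] padding pairs with mask_value=0.0. A minimal pure-NumPy sketch of that rule (compute_mask and the sample data are my own, not from the code above):

```python
import numpy as np

def compute_mask(batch, mask_value=0.0):
    # batch has shape (samples, timesteps, features);
    # a timestep survives if ANY feature differs from mask_value
    return np.any(batch != mask_value, axis=-1)

padded = np.array([[[47.236, 31.43, 13.905],
                    [47.378, 31.148, 13.562],
                    [0.0, 0.0, 0.0]]])
print(compute_mask(padded).tolist())  # [[True, True, False]]
```
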

The training X (padded_training_seqs) data looks something like this (only much larger):

[
[[43.103, 27.092, 19.078], [43.496, 26.746, 19.198], [43.487, 27.363, 19.092], [44.107, 27.779, 18.487], [44.529, 27.888, 17.768]], 
[[44.538, 27.901, 17.756], [44.663, 28.073, 17.524], [44.623, 27.83, 17.401], [44.68, 28.034, 17.601], [0,0,0]],
[[47.236, 31.43, 13.905], [47.378, 31.148, 13.562], [0,0,0], [0,0,0], [0,0,0]]
]
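For reference, the padding step that produced this array can be sketched as a hand-rolled equivalent of keras.preprocessing.sequence.pad_sequences with padding='post' (pad_post is an illustrative name, not from the code above):

```python
# Extend each ragged sequence of [latitude, longitude, temperature]
# steps to a common length with [0, 0, 0] entries.
def pad_post(seqs, max_len, pad=(0.0, 0.0, 0.0)):
    return [list(s) + [list(pad)] * (max_len - len(s)) for s in seqs]

seqs = [
    [[44.538, 27.901, 17.756], [44.663, 28.073, 17.524]],
    [[47.236, 31.43, 13.905]],
]
padded = pad_post(seqs, max_len=3)
print([len(s) for s in padded])  # [3, 3]
```
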

and the training Y (training_final_steps) data looks like this:

[
[44.652, 39.649], [37.362, 54.106], [37.115, 57.66501]
]
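One thing worth noting about these targets: they sit around 35-60 while the network's outputs come out near [0.4, 0.5], so a scale mismatch is at least plausible. A common sanity check is to scale the targets into a small range before training and invert the scaling after predicting; a sketch under my own assumptions (the function names and the 0-60 bounds are illustrative, not from the post):

```python
# Min-max scale regression targets into [0, 1] and invert afterwards.
def minmax_scale(y, lo, hi):
    return [[(v - lo) / (hi - lo) for v in row] for row in y]

def minmax_invert(y_scaled, lo, hi):
    return [[v * (hi - lo) + lo for v in row] for row in y_scaled]

Y = [[44.652, 39.649], [37.362, 54.106], [37.115, 57.66501]]
scaled = minmax_scale(Y, lo=0.0, hi=60.0)
restored = minmax_invert(scaled, lo=0.0, hi=60.0)
print(all(abs(a - b) < 1e-9
          for row_a, row_b in zip(Y, restored)
          for a, b in zip(row_a, row_b)))  # True
```
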
  • Can you please put the complete code as well as the input data (lat,long,temp) needed online somewhere (say a github gist?). – ruoho ruotsi Mar 19 '16 at 15:29
  • https://gist.github.com/jeshaitan/b2b322772986be3b4c0a This doesn't include the data collection or the reqTemp function, but let me know if you need those too. – jeshaitan Mar 20 '16 at 11:43
  • Well, to better understand your issue I need to be able to run your example, and for that I need the input data. Otherwise, just reading without messing w/ the code, it is tough to get a sense of the increase in loss with each epoch. – ruoho ruotsi Mar 21 '16 at 00:32
  • @ruohoruotsi https://github.com/jeshaitan/migration-lstm The main.py file is split into three parts: part 1 cleans all the CSV data into the X train, part 2 is some functions for fetching the third component of the output data from the internet, and part 3 is the actual LSTM network, which I have now changed to a linear activation function. – jeshaitan Mar 21 '16 at 12:57

0 Answers