I am using a Keras RNN cell to perform part-of-speech tagging. The architecture is as follows (I cannot share the code for privacy reasons):
- An embedding layer of 40 units, output shape (batch_size, max_sentence_length, 40)
- tf.keras.layers.SimpleRNNCell(state_size=number_of_tags_in_dataset+20, dropout=0.2, recurrent_dropout=0.0, activation='tanh')
- tf.contrib.layers.fully_connected(units=state_size)
- tf.contrib.layers.dropout(keep_prob=0.6)
- tf.contrib.layers.fully_connected(units=number_of_tags)
- tf.keras.layers.BatchNormalization()
- tf.keras.activations.relu()
- Using tf.contrib.seq2seq.sequence_loss() with AdamOptimizer and gradient clipping of 0.5
- batch_size=32, learn_rate=0.01
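For reference, the stack described above could be sketched in modern tf.keras roughly like this. The vocabulary size, tag count, and maximum sentence length are placeholder assumptions (not from my actual setup), the deprecated tf.contrib layers are swapped for their Keras equivalents, and clipnorm stands in for the gradient clipping:

```python
import numpy as np
import tensorflow as tf

vocab_size = 10000        # assumed vocabulary size (placeholder)
num_tags = 45             # assumed number of POS tags (placeholder)
max_len = 50              # assumed max sentence length (placeholder)
state_size = num_tags + 20

model = tf.keras.Sequential([
    tf.keras.layers.Embedding(vocab_size, 40),
    # SimpleRNN wraps SimpleRNNCell; return_sequences gives one output per timestep
    tf.keras.layers.SimpleRNN(state_size, activation="tanh",
                              dropout=0.2, recurrent_dropout=0.0,
                              return_sequences=True),
    tf.keras.layers.Dense(state_size),    # replaces tf.contrib.layers.fully_connected
    tf.keras.layers.Dropout(0.4),         # keep_prob=0.6 corresponds to rate=0.4
    tf.keras.layers.Dense(num_tags),      # per-timestep tag scores
    tf.keras.layers.BatchNormalization(),
    tf.keras.layers.Activation("relu"),
])

# sequence_loss over padded batches behaves like per-timestep sparse
# cross-entropy; clipnorm=0.5 is one interpretation of "clipping of 0.5"
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=0.01, clipnorm=0.5),
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=["accuracy"],
)

x = np.random.randint(0, vocab_size, size=(32, max_len))
print(tuple(model(x).shape))  # one tag-score vector per token
```

Note that ending with BatchNormalization and ReLU before the loss means the "logits" fed to the cross-entropy are non-negative, which is unusual; I kept it here only because it mirrors the layer order above.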
The results for ~14 epochs are as follows (due to resource constraints, that's the maximum number of epochs I can run):
accuracy 0.0172309
accuracy 0.800888
accuracy 0.866243
accuracy 0.893743
accuracy 0.896006
accuracy 0.901575
accuracy 0.899487
accuracy 0.898529
accuracy 0.900531
accuracy 0.902532
accuracy 0.899051
accuracy 0.903055
accuracy 0.901053
accuracy 0.898703
accuracy 0.898703
I have noticed that no matter what hyperparameter changes I make, the accuracy gets stuck around 89-90%. Can you provide some suggestions to boost it? I am fairly new to deep learning, so I have been struggling a lot to optimize my model. I have also tried bidirectional LSTMs for the same task, but they are too slow given my resource constraints, and the maximum accuracy I can achieve with them is around 92%.