TensorFlow allows you to create a MultiRNNCell composed sequentially of multiple simple cells (e.g., LSTM and GRU). I usually use the same type of cell when creating a MultiRNNCell, but I was wondering whether there could be some benefit in using both LSTM and GRU cells together. Does anyone have experience with this, or theoretical insights?
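For concreteness, here is a minimal sketch of the kind of mixed stack I mean, assuming the TensorFlow 1.x `tf.nn.rnn_cell` API (the unit and feature sizes are arbitrary placeholders):

```python
import tensorflow as tf

num_units = 128  # arbitrary hidden size for illustration

# A two-layer stack mixing an LSTM layer and a GRU layer.
cells = [
    tf.nn.rnn_cell.LSTMCell(num_units),
    tf.nn.rnn_cell.GRUCell(num_units),
]
stacked_cell = tf.nn.rnn_cell.MultiRNNCell(cells)

# Inputs shaped [batch, time, features]; dynamic_rnn unrolls the stack over time.
inputs = tf.placeholder(tf.float32, [None, None, 64])
outputs, final_state = tf.nn.dynamic_rnn(stacked_cell, inputs, dtype=tf.float32)
```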
- Would be pretty interesting to see. I might be testing the combination of LSTM and NARX cells. – Thomas Wagenaar May 11 '17 at 08:41
1 Answer
There are thousands of RNN cell (kernel) variants, and both the LSTM and the GRU do the same job: they process the input $x_i$ together with the previous state $s_{i-1}$ and produce an output and the current state. Even though the LSTM preceded the GRU and the GRU requires less computation, the LSTM is roughly on a par with the GRU in performance. So I think stacking LSTM and GRU cells (or any other cells) might be interesting, but it would not make a big difference in performance compared to simply stacking either LSTM cells or GRU cells.
As George E. P. Box put it, "all models are wrong, but some are useful"; you can simply give it a try and see whether it helps on your problem.
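If you do want to try it, a minimal sketch of such a comparison (again assuming the TensorFlow 1.x `tf.nn.rnn_cell` API; `build_stack` is a hypothetical helper, and the layer counts and unit sizes are placeholders) could look like this:

```python
import tensorflow as tf

def build_stack(cell_constructors, num_units=128):
    """Stack (possibly heterogeneous) RNN cells into a single MultiRNNCell."""
    return tf.nn.rnn_cell.MultiRNNCell(
        [ctor(num_units) for ctor in cell_constructors])

# Candidate stacks to train under otherwise identical settings
# (same data, optimizer, layer count, and unit size).
pure_lstm = build_stack([tf.nn.rnn_cell.LSTMCell, tf.nn.rnn_cell.LSTMCell])
pure_gru  = build_stack([tf.nn.rnn_cell.GRUCell, tf.nn.rnn_cell.GRUCell])
mixed     = build_stack([tf.nn.rnn_cell.LSTMCell, tf.nn.rnn_cell.GRUCell])
```

Comparing validation performance across the three stacks with everything else held fixed is the cleanest way to see whether the mix buys you anything.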

Lerner Zhang
- +1 for the last sentence: they really don't do things that are that different, so there's not much advantage you'd get from combining them. – Wayne Feb 08 '18 at 14:18
- When you say that the LSTM is on par with the GRU, are you making the comparison on a per-unit basis or a per-parameter basis? – Sycorax Feb 08 '18 at 14:59