The Universal Approximation Theorem says that, under mild conditions on the activation function, a feedforward network with a single hidden layer can approximate any continuous function on a compact subset of $\mathbb{R}^n$ to arbitrary accuracy.
I believe this result extends to LSTMs, since for certain choices of the gate parameters an LSTM cell collapses into a feedforward network; I sketch the reduction below.
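Concretely, here is the simplification I have in mind (a sketch only; I am using the standard peephole-free LSTM cell equations, and saturating the gates through their biases is my own construction):

$$
\begin{aligned}
i_t &= \sigma(W_i x_t + U_i h_{t-1} + b_i), \\
f_t &= \sigma(W_f x_t + U_f h_{t-1} + b_f), \\
o_t &= \sigma(W_o x_t + U_o h_{t-1} + b_o), \\
\tilde{c}_t &= \tanh(W_c x_t + U_c h_{t-1} + b_c), \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t, \\
h_t &= o_t \odot \tanh(c_t).
\end{aligned}
$$

Taking $b_f \to -\infty$ and $b_i, b_o \to +\infty$, with $W_i = W_f = W_o = 0$ and all recurrent weights $U_\ast = 0$, gives $f_t \to 0$ and $i_t, o_t \to 1$, so after a single time step

$$
h_1 = \tanh\!\big(\tanh(W_c x_1 + b_c)\big).
$$

With a linear readout $y = V h_1$, this is exactly the one-hidden-layer form $\sum_j V_j \,\varphi(w_j^\top x + b_j)$ with activation $\varphi = \tanh \circ \tanh$, which is still continuous, bounded, and nonconstant, so Hornik-type conditions for universal approximation are satisfied.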
Is there a stronger result for LSTMs, e.g., one that covers sequence-to-sequence maps rather than just static functions?