
According to the book "Fundamentals of Neural Network", the input layer of a feedforward neural network has a linear activation function. An Elman recurrent NN is the same as a feedforward network except that it has a context layer. What should the activation function of each layer in an Elman recurrent NN be?

I have searched many sources for the answer, but many journal papers do not mention the activation functions. Some information online says that the hidden layers should use the tansig (hyperbolic tangent sigmoid) activation function.

So, what should the activation function of each layer in an Elman recurrent NN be?

Franck Dernoncourt
user7085565

1 Answer


An Elman network leaves the choice of the activation function to the user, since it only specifies these equations for the recurrence:

\begin{align} h_t &= \sigma_h(W_{h} x_t + U_{h} h_{t-1} + b_h) \\ y_t &= \sigma_y(W_{y} h_t + b_y) \end{align}

Variables and functions:

  • $x_t$: input vector
  • $h_t$: hidden layer vector
  • $y_t$: output vector
  • $W$, $U$ and $b$: parameter matrices ($W$, $U$) and bias vectors ($b$)
  • $\sigma_h$ and $\sigma_y$: activation functions of the hidden layer and the output layer
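
To make the point concrete, here is a minimal pure-Python sketch of the recurrence in which both activation functions are ordinary arguments. The function names (`matvec`, `elman_forward`), the zero initial hidden state, and the default choices (tanh for the hidden layer, identity for the output) are assumptions made for illustration only; nothing in the definition of an Elman network prescribes them.

```python
import math


def matvec(M, v):
    """Multiply a matrix (given as a list of rows) by a vector (a list)."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in M]


def elman_forward(xs, W_h, U_h, b_h, W_y, b_y,
                  sigma_h=math.tanh, sigma_y=lambda z: z):
    """Run the Elman recurrence over a sequence of input vectors xs.

    sigma_h and sigma_y are applied elementwise. The defaults are
    illustrative assumptions, not part of the Elman definition.
    """
    h = [0.0] * len(b_h)  # h_0 = 0: an assumed initial context
    ys = []
    for x in xs:
        # h_t = sigma_h(W_h x_t + U_h h_{t-1} + b_h)
        pre_h = [a + b + c
                 for a, b, c in zip(matvec(W_h, x), matvec(U_h, h), b_h)]
        h = [sigma_h(z) for z in pre_h]
        # y_t = sigma_y(W_y h_t + b_y)
        pre_y = [a + b for a, b in zip(matvec(W_y, h), b_y)]
        ys.append([sigma_y(z) for z in pre_y])
    return ys, h
```

Switching to, say, a logistic sigmoid in the hidden layer only requires passing `sigma_h=lambda z: 1.0 / (1.0 + math.exp(-z))`; the recurrence itself does not change.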

{1}, which is one of the most cited references defining an Elman network, doesn't seem to indicate that an Elman network should use some specific activation functions either.

FYI: Comprehensive list of activation functions in neural networks with pros/cons


References:

Franck Dernoncourt