- A Recurrent Neural Network is a class of Neural Network where the computational graph may contain cycles.
- They are best suited to tasks on sequential data.
Architectural Details
- A hidden state is a state that is not necessarily observed but holds some latent representation of the inputs. Typically, it is used to aggregate sequential data.
- We use hidden states to avoid storing a number of parameters that grows with how many time steps back we look at the input's values. Let $h_t$ denote the hidden state at time step $t$ and $x_t$ the input. We calculate the hidden state as

  $$h_t = f(x_t, h_{t-1})$$

  That is, for an RNN, we want the current state to depend on the previous state. However, unlike under the Markov property, we retain some information about all states seen so far, because $h_{t-1}$ itself summarizes everything before it.
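As a minimal sketch of this recurrence, the snippet below uses a hypothetical scalar update $f(x_t, h_{t-1}) = \tanh(w_x x_t + w_h h_{t-1})$. The weights `w_x` and `w_h`, their values, and the choice of `tanh` are assumptions made for illustration and are not part of these notes. It shows that changing only the first input still changes the final hidden state, which is the sense in which the hidden state keeps information from more than just the previous step.

```python
import math


def f(x_t, h_prev, w_x=0.5, w_h=0.9):
    """Hypothetical update rule: new hidden state from current input and previous state."""
    return math.tanh(w_x * x_t + w_h * h_prev)


def final_hidden_state(xs, h0=0.0):
    """Fold a whole sequence into one hidden state by applying f at every time step."""
    h = h0
    for x_t in xs:
        h = f(x_t, h)
    return h


# Two sequences that differ only in their first element.
h_a = final_hidden_state([1.0, 0.2, -0.3])
h_b = final_hidden_state([-1.0, 0.2, -0.3])

# The final states differ, so the earliest input still influences h_3.
print(h_a, h_b)
```

Note that the update at step 3 only reads $x_3$ and $h_2$; the influence of $x_1$ arrives indirectly through the chain of hidden states.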
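- In practice, we calculate the hidden states as follows. Let
  - $n$ be the size of a minibatch,
  - $d$ be the size of each input in each example,
  - $H_t \in \mathbb{R}^{n \times h}$ be the hidden state at time step $t$, with $h$ hidden units,
  - $X_t \in \mathbb{R}^{n \times d}$ be a minibatch of inputs,
  - $\phi$ be some activation function.

  Then

  $$H_t = \phi(X_t W_{xh} + H_{t-1} W_{hh} + b_h)$$

  where $W_{xh} \in \mathbb{R}^{d \times h}$ and $W_{hh} \in \mathbb{R}^{h \times h}$ are weights, and $b_h \in \mathbb{R}^{1 \times h}$ is a bias term.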
- We make use of recurrent layers. These are layers which use hidden states obtained from previous computations. More formally, let
  - $X_t$ be a minibatch of inputs at time step $t$,
  - $H_t$ be the hidden layer output at time step $t$,
  - $\phi$ be an activation function.

  We calculate the output as

  $$H_t = \phi(X_t W_{xh} + H_{t-1} W_{hh} + b_h)$$

  - The layer is parameterized by the weights applied to the previous hidden state ($W_{hh}$), the weights applied to the inputs ($W_{xh}$, as in a fully connected layer), and the bias term ($b_h$).
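The recurrent layer described above can be sketched in plain Python as follows. This is an illustrative sketch, not a reference implementation: the names `W_xh`, `W_hh`, `b_h`, the use of `tanh` as the activation, the zero initial hidden state, and the sizes `n`, `d`, `h`, `T` are all assumptions chosen for the example. Matrices are stored as lists of rows so that the block needs only the standard library.

```python
import math
import random


def matmul(A, B):
    """Multiply matrix A (r x k) by matrix B (k x c), both given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def rnn_step(X_t, H_prev, W_xh, W_hh, b_h):
    """One time step: tanh(X_t W_xh + H_prev W_hh + b_h), bias broadcast over rows."""
    XW = matmul(X_t, W_xh)      # shape (n, h)
    HW = matmul(H_prev, W_hh)   # shape (n, h)
    return [[math.tanh(xw + hw + b) for xw, hw, b in zip(x_row, h_row, b_h)]
            for x_row, h_row in zip(XW, HW)]


def rnn_layer(Xs, W_xh, W_hh, b_h):
    """Apply the same parameters at every time step and return all hidden states."""
    n, h = len(Xs[0]), len(b_h)
    H = [[0.0] * h for _ in range(n)]   # initial hidden state of zeros (an assumption)
    states = []
    for X_t in Xs:
        H = rnn_step(X_t, H, W_xh, W_hh, b_h)
        states.append(H)
    return states


def rand_matrix(rows, cols):
    """Small random matrix, used only to have some numbers to push through the layer."""
    return [[random.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


random.seed(0)
n, d, h, T = 2, 3, 4, 5   # minibatch size, input size, hidden units, time steps
W_xh, W_hh, b_h = rand_matrix(d, h), rand_matrix(h, h), [0.0] * h
Xs = [rand_matrix(n, d) for _ in range(T)]

states = rnn_layer(Xs, W_xh, W_hh, b_h)
num_params = d * h + h * h + h   # independent of the number of time steps T
print(len(states), len(states[0]), len(states[0][0]), num_params)
```

Because the same three parameters are reused at every step, the parameter count depends only on the input size and the number of hidden units, not on the sequence length.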
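- Typical training involves Backpropagation Through Time (BPTT): the network is unrolled across time steps and gradients are backpropagated through the unrolled graph.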
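To make Backpropagation Through Time concrete, here is a small sketch for a scalar RNN. Everything specific in it is an assumption for illustration: the update $h_t = \tanh(w_x x_t + w_h h_{t-1} + b)$, the squared-error loss on the final hidden state only, and the example inputs and parameter values. The gradients are accumulated while walking backwards over the time steps, then compared against finite differences as a sanity check.

```python
import math


def forward(xs, w_x, w_h, b):
    """Run h_t = tanh(w_x * x_t + w_h * h_{t-1} + b) and keep every hidden state."""
    hs = [0.0]                       # hs[0] is the initial state h_0
    for x_t in xs:
        hs.append(math.tanh(w_x * x_t + w_h * hs[-1] + b))
    return hs


def loss(xs, y, w_x, w_h, b):
    """Squared error on the final hidden state only."""
    return 0.5 * (forward(xs, w_x, w_h, b)[-1] - y) ** 2


def bptt(xs, y, w_x, w_h, b):
    """Gradients of the loss w.r.t. (w_x, w_h, b), accumulated backwards over time."""
    hs = forward(xs, w_x, w_h, b)
    g_wx = g_wh = g_b = 0.0
    dh = hs[-1] - y                      # dL/dh_T
    for t in range(len(xs), 0, -1):
        da = dh * (1.0 - hs[t] ** 2)     # through tanh: d tanh(a)/da = 1 - tanh(a)^2
        g_wx += da * xs[t - 1]           # the same w_x is reused at every step
        g_wh += da * hs[t - 1]           # the same w_h is reused at every step
        g_b += da
        dh = da * w_h                    # pass the gradient on to h_{t-1}
    return g_wx, g_wh, g_b


xs, y = [0.5, -1.0, 0.25, 0.8], 0.3
params = (0.7, -0.4, 0.1)
grads = bptt(xs, y, *params)

# Sanity check against central finite differences.
eps = 1e-6
numeric = []
for i in range(3):
    up = list(params)
    down = list(params)
    up[i] += eps
    down[i] -= eps
    numeric.append((loss(xs, y, *up) - loss(xs, y, *down)) / (2 * eps))

print(grads)
print(numeric)
```

Each parameter's gradient is a sum over time steps because the same parameter is used at every step. The repeated multiplication by `w_h` and the tanh derivative in the backward loop is also where gradients can shrink or grow over long sequences.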