Architectural Details

  • A hidden state is a state which is not necessarily observed, but which holds some form of latent representation of the inputs. Typically, it is used to aggregate sequential data.
    • We use hidden states to avoid having to store many parameters, since otherwise we would need to keep the input's values from all earlier time steps.

      Let $h_t$ denote the hidden state at time step $t$ and $x_t$ the input.

      We calculate the hidden state as

      $$h_t = f(x_t, h_{t-1})$$

      That is, for an RNN, we want the current state to depend on the previous state. However, unlike under the Markov property, we actually do retain some information about all previously seen states, since each $h_t$ is built recursively from $h_{t-1}$ (a minimal sketch follows the figure below).

RNN computation. Image taken from Zhang et al.
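
A minimal Python sketch of this recurrence; the update rule `f` here is a stand-in assumption, and the concrete parameterized form appears in the next bullet:

```python
import math

def f(x_t, h_prev):
    # Stand-in update rule (assumption); a real RNN uses learned
    # weights, as in the parameterized form below.
    return math.tanh(x_t + h_prev)

xs = [1.0, -0.5, 2.0]   # toy input sequence
h = 0.0                 # initial hidden state h_0
for x_t in xs:
    h = f(x_t, h)       # h_t = f(x_t, h_{t-1})
print(h)                # h now summarizes the entire sequence seen so far
```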
  • In practice, we calculate the hidden states as follows. Let $n$ be the size of a minibatch, $d$ the size of the inputs in each example, $\mathbf{H}_t \in \mathbb{R}^{n \times h}$ the hidden state (with $h$ hidden units), $\mathbf{X}_t \in \mathbb{R}^{n \times d}$ a minibatch of inputs, and $\phi$ some activation function.

    Then

    $$\mathbf{H}_t = \phi(\mathbf{X}_t \mathbf{W}_{xh} + \mathbf{H}_{t-1} \mathbf{W}_{hh} + \mathbf{b}_h)$$

    where $\mathbf{W}_{xh} \in \mathbb{R}^{d \times h}$ and $\mathbf{W}_{hh} \in \mathbb{R}^{h \times h}$ are weights, and $\mathbf{b}_h \in \mathbb{R}^{1 \times h}$ is a bias term.
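
A minimal numpy sketch of this update; the sizes `n`, `d`, and `h` below are illustrative assumptions:

```python
import numpy as np

n, d, h = 4, 3, 5                  # minibatch, input, and hidden sizes (assumed)
rng = np.random.default_rng(0)

X_t    = rng.normal(size=(n, d))   # minibatch of inputs at time step t
H_prev = np.zeros((n, h))          # previous hidden state H_{t-1}
W_xh   = rng.normal(size=(d, h))   # input-to-hidden weights
W_hh   = rng.normal(size=(h, h))   # hidden-to-hidden weights
b_h    = np.zeros((1, h))          # bias

# H_t = phi(X_t W_xh + H_{t-1} W_hh + b_h), with phi = tanh
H_t = np.tanh(X_t @ W_xh + H_prev @ W_hh + b_h)
print(H_t.shape)                   # (n, h)
```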

  • We make use of Recurrent Layers. These are layers which use hidden states obtained from previous computations.

    More formally, let $\mathbf{X}_t \in \mathbb{R}^{n \times d}$ be a minibatch of inputs at time step $t$, $\mathbf{H}_t \in \mathbb{R}^{n \times h}$ the hidden layer output of time step $t$, and $\phi$ an activation function.

    We perform the calculation of the output as

    $$\mathbf{O}_t = \mathbf{H}_t \mathbf{W}_{hq} + \mathbf{b}_q$$

    • We parameterize on the weights of the hidden states ($\mathbf{W}_{hh}$), the weights of the inputs ($\mathbf{W}_{xh}$, as in a fully connected layer), the output weights ($\mathbf{W}_{hq}$), and the bias terms ($\mathbf{b}_h$, $\mathbf{b}_q$).
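
Putting the two equations together, a self-contained numpy sketch of a recurrent layer unrolled over a toy sequence; all sizes and the random data are assumptions:

```python
import numpy as np

n, d, h, q, T = 4, 3, 5, 2, 6            # batch, input, hidden, output sizes; sequence length (assumed)
rng = np.random.default_rng(0)

W_xh = rng.normal(size=(d, h)) * 0.1     # input-to-hidden weights
W_hh = rng.normal(size=(h, h)) * 0.1     # hidden-to-hidden weights
b_h  = np.zeros((1, h))                  # hidden bias
W_hq = rng.normal(size=(h, q)) * 0.1     # hidden-to-output weights
b_q  = np.zeros((1, q))                  # output bias

H = np.zeros((n, h))                     # initial hidden state
outputs = []
for t in range(T):
    X_t = rng.normal(size=(n, d))        # toy minibatch of inputs at step t
    H = np.tanh(X_t @ W_xh + H @ W_hh + b_h)  # H_t reuses the previous H
    outputs.append(H @ W_hq + b_q)       # O_t = H_t W_hq + b_q

print(len(outputs), outputs[0].shape)    # T outputs, each of shape (n, q)
```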
  • Typical training involves using Backpropagation Through Time (BPTT): ordinary backpropagation applied to the network unrolled across the time steps of the sequence.
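
A hedged sketch of BPTT using PyTorch's autograd, which differentiates through the unrolled recurrence automatically; the model, data, and sizes are assumptions for illustration:

```python
import torch
import torch.nn as nn

n, d, h, T = 4, 3, 5, 6                 # batch, input, hidden sizes; sequence length (assumed)
rnn  = nn.RNN(input_size=d, hidden_size=h, batch_first=True)
head = nn.Linear(h, 1)                  # maps each hidden state to a scalar output

X = torch.randn(n, T, d)                # toy input sequences
y = torch.randn(n, T, 1)                # toy targets

H, _ = rnn(X)                           # unroll the recurrence over all T steps
loss = nn.functional.mse_loss(head(H), y)
loss.backward()                         # BPTT: gradients flow back through every step

print(rnn.weight_hh_l0.grad.shape)      # gradient on the hidden-to-hidden weights
```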

Links