Graph neural networks (GNNs) are neural network architectures designed for learning representations of graph-structured data, including node representations.
More formally, the problem solved by GNNs can be framed as follows. Let $G = (V, E)$ be a graph with $n = |V|$ nodes and let $A \in \{0, 1\}^{n \times n}$ be its adjacency matrix. We also let $X \in \mathbb{R}^{n \times m}$ be the attribute matrix of node features. The goal is to learn node representations in $d$ dimensions, denoted $H \in \mathbb{R}^{n \times d}$, such that the graph structural information and node attributes are preserved.
Each layer in a GNN performs two functions:
- Aggregate information from the neighbors of each node.
- Combine the aggregated neighbor information with the current node representation.
 
The general framework can be described as follows. Let $h_v^{(t)}$ be the representation of vertex $v$ at time step $t$. We have that
$$a_v^{(t)} = \text{AGGREGATE}^{(t)}\left(\left\{ h_u^{(t-1)} : u \in \mathcal{N}(v) \right\}\right), \qquad h_v^{(t)} = \text{COMBINE}^{(t)}\left(h_v^{(t-1)}, a_v^{(t)}\right).$$
Note the similarity to LSTMs and GRUs.
To use the representations, we take the representations at the last layer $T$. That is, $z_v = h_v^{(T)}$.
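As a concrete (if simplified) illustration, here is a minimal NumPy sketch of one such layer. The mean aggregator, the ReLU combine step, and all names (`gnn_layer`, `W_self`, `W_neigh`) are my own choices for the sketch, not a prescribed architecture:

```python
import numpy as np

def gnn_layer(H, A, W_self, W_neigh):
    """One illustrative GNN layer: mean-aggregate neighbors, then combine."""
    # AGGREGATE: mean over each node's neighbors (rows of A mark neighbors)
    deg = A.sum(axis=1, keepdims=True).clip(min=1)  # guard isolated nodes
    agg = (A @ H) / deg
    # COMBINE: mix the node's own representation with the aggregate, then ReLU
    return np.maximum(0.0, H @ W_self + agg @ W_neigh)

# Tiny usage example: a 3-node path graph with random 4-dim features
rng = np.random.default_rng(0)
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]], dtype=float)
H = rng.normal(size=(3, 4))
W_self = rng.normal(size=(4, 8))
W_neigh = rng.normal(size=(4, 8))
print(gnn_layer(H, A, W_self, W_neigh).shape)  # -> (3, 8)
```

Stacking $T$ such layers yields the final representations $z_v = h_v^{(T)}$.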
Issues
The following challenges are apparent in graph representation learning and motivate advances in GNNs.
- High computational complexity, which limits applicability to large networks.
- Low parallelizability, since edges imply coupling between nodes.
- Non-stationarity: samples from graph data are dependent on each other.
 
Another problem is the over-smoothing problem: if the network is too deep, node representations become too similar to one another.
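A quick way to see this numerically, under my own simplifying assumptions (pure neighborhood averaging as a stand-in for a deep GNN with no learned weights): repeatedly applying the row-normalized adjacency drives all representations together, so their total pairwise squared distance collapses toward zero.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50
A = (rng.random((n, n)) < 0.1).astype(float)
A = np.maximum(A, A.T)                   # symmetrize the random graph
np.fill_diagonal(A, 1.0)                 # add self-loops
P = A / A.sum(axis=1, keepdims=True)     # row-normalized adjacency (mean aggregation)

def tpsd(H):
    # Total pairwise squared distance: sum_{i,j} ||h_i - h_j||^2
    sq = (H ** 2).sum(axis=1)
    return float((sq[:, None] + sq[None, :] - 2 * H @ H.T).sum())

H = rng.normal(size=(n, 8))
for layer in range(1, 21):
    H = P @ H                            # one "layer" of pure smoothing
    if layer % 5 == 0:
        print(layer, tpsd(H))            # TPSD decays toward 0 as depth grows
```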
One approach to alleviating over-smoothing is PairNorm. The goal is to keep the total pairwise squared distance (TPSD) of node representations unchanged across layers.
In particular, writing $x_i$ for the input to PairNorm and $\dot{x}_i$ for its output, we want
$$\sum_{i,j} \|\dot{x}_i - \dot{x}_j\|_2^2 = \sum_{i,j} \|x_i - x_j\|_2^2.$$
Or, more simply, we can keep the total pairwise squared distance at a fixed constant value $C$ across all layers:
$$\sum_{i,j} \|\dot{x}_i - \dot{x}_j\|_2^2 = C.$$
The above computation can be simplified by noticing that
$$\text{TPSD}(X) = \sum_{i,j} \|x_i - x_j\|_2^2 = 2n^2 \left( \frac{1}{n} \sum_i \|x_i\|_2^2 - \left\| \frac{1}{n} \sum_i x_i \right\|_2^2 \right),$$
where $\|\cdot\|_2$ is the L2-norm. This can be further simplified by centering the representations,
$$x_i^c = x_i - \frac{1}{n} \sum_j x_j,$$
so that the second term vanishes and
$$\text{TPSD}(X^c) = 2n \sum_i \|x_i^c\|_2^2.$$
We then want to update $x_i$ using the rule
$$\dot{x}_i = s \cdot \frac{x_i^c}{\sqrt{\frac{1}{n} \sum_j \|x_j^c\|_2^2}},$$
where $s$ is a scaling hyperparameter. After this update, $\text{TPSD}(\dot{X}) = 2n^2 s^2$, a constant independent of the input.
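Putting the two steps together, here is a small NumPy sketch of PairNorm as read off the derivation above (function and variable names are mine). The final line checks that the output's TPSD lands on the constant $2 n^2 s^2$ regardless of how the input is scaled or shifted:

```python
import numpy as np

def pairnorm(X, s=1.0):
    """PairNorm sketch: center, then rescale so that TPSD(out) = 2 * n^2 * s^2."""
    Xc = X - X.mean(axis=0, keepdims=True)         # step 1: center the rows
    scale = np.sqrt((Xc ** 2).sum(axis=1).mean())  # sqrt((1/n) * sum_j ||x_j^c||^2)
    return s * Xc / scale                          # step 2: rescale

def tpsd(X):
    # Total pairwise squared distance: sum_{i,j} ||x_i - x_j||^2
    sq = (X ** 2).sum(axis=1)
    return float((sq[:, None] + sq[None, :] - 2 * X @ X.T).sum())

rng = np.random.default_rng(0)
n, s = 100, 1.0
X = rng.normal(size=(n, 16)) * 5 + 3   # arbitrary, badly scaled input
Y = pairnorm(X, s=s)
print(tpsd(Y), 2 * n**2 * s**2)        # both ~ 20000: TPSD pinned to 2*n^2*s^2
```

Applying this after every GNN layer keeps representations from collapsing together, which is exactly the over-smoothing failure mode described above.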