Unsupervised Graph Neural Network

VGAE

Variational Graph Autoencoders (VGAE) apply the techniques from VAEs to graphs.
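Each node in the graph is associated with a feature vector, stored as a row of the matrix $X$ (an $N \times D$ matrix, where $N$ is the number of nodes). Each node is also associated with a latent variable $z_i$, stored as a row of the matrix $Z$. The goal is to infer the latent variables and decode the edges of the adjacency matrix $A$ from them.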
We have the following components.
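- An encoder to learn the latent variable distribution. We denote this as $q(Z \mid X, A)$. We assume nodes are independent, so $q(Z \mid X, A) = \prod_{i=1}^{N} q(z_i \mid X, A)$ with $q(z_i \mid X, A) = \mathcal{N}(z_i \mid \mu_i, \mathrm{diag}(\sigma_i^2))$, where $\mathcal{N}$ is the Normal distribution, $\mu = \mathrm{GCN}_\mu(X, A)$, $\log \sigma = \mathrm{GCN}_\sigma(X, A)$, and $\mathrm{GCN}$ denotes a graph convolutional network. The prediction is done using $\mathrm{GCN}(X, A) = \tilde{A}\,\mathrm{ReLU}(\tilde{A} X W_0) W_1$, where $\tilde{A}$ is the symmetrically normalized adjacency matrix. Each $W_i$ is a learnable parameter.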
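- A decoder to predict connections between nodes given latent variables. We denote this as $p(A \mid Z)$. We assume conditional independence among possible edges for tractability, so $p(A \mid Z) = \prod_{i=1}^{N} \prod_{j=1}^{N} p(A_{ij} \mid z_i, z_j)$ with $p(A_{ij} = 1 \mid z_i, z_j) = \mathrm{sigmoid}(z_i^\top z_j)$. The decoder has no learnable parameters.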
- The prior distribution of latent variables is set to the standard Gaussian. That is, $p(Z) = \prod_{i=1}^{N} \mathcal{N}(z_i \mid 0, I)$.
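
The three components above can be sketched in dependency-free Python, with matrices stored as lists of rows. This is only an illustrative sketch: the helper names (`normalize_adjacency`, `gcn`, `encode`, `decode`), the toy path graph, the sharing of the first-layer weight `w0` between the mean and log-variance networks, and the reparameterized sampling step `z = mu + sigma * eps` are assumptions that these notes do not state.

```python
import math
import random

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def normalize_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2}: add self-loops, then scale symmetrically."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]

def gcn(a_norm, x, w0, w1):
    """Two-layer GCN: A_norm ReLU(A_norm X W0) W1."""
    hidden = [[max(0.0, v) for v in row] for row in matmul(a_norm, matmul(x, w0))]
    return matmul(a_norm, matmul(hidden, w1))

def encode(a_norm, x, w0, w_mu, w_sigma, rng):
    """Sample Z from q(Z | X, A) via z = mu + sigma * eps (assumed sampling scheme)."""
    mu = gcn(a_norm, x, w0, w_mu)
    log_sigma = gcn(a_norm, x, w0, w_sigma)
    z = [[m + math.exp(s) * rng.gauss(0.0, 1.0) for m, s in zip(m_row, s_row)]
         for m_row, s_row in zip(mu, log_sigma)]
    return mu, log_sigma, z

def sigmoid(v):
    """Numerically stable logistic function."""
    return 1.0 / (1.0 + math.exp(-v)) if v >= 0 else math.exp(v) / (1.0 + math.exp(v))

def decode(z):
    """Edge probabilities p(A_ij = 1 | z_i, z_j) = sigmoid(z_i . z_j); no parameters."""
    return [[sigmoid(sum(a * b for a, b in zip(zi, zj))) for zj in z] for zi in z]

def random_matrix(rows, cols, rng):
    """Small random weights, standing in for learned parameters."""
    return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]

rng = random.Random(0)
adj = [[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]]      # toy path graph
x = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]  # one-hot features
w0 = random_matrix(4, 8, rng)
w_mu = random_matrix(8, 2, rng)
w_sigma = random_matrix(8, 2, rng)

mu, log_sigma, z = encode(normalize_adjacency(adj), x, w0, w_mu, w_sigma, rng)
edge_probs = decode(z)  # 4 x 4 matrix of predicted edge probabilities
```

Note that `decode` only takes inner products of the latent vectors, which is why all learnable weights sit in `gcn`. A practical implementation would use a tensor library with automatic differentiation rather than nested lists.
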
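The objective is to maximize the following variational lower bound (ELBO) with respect to the encoder weights $W_i$:

$$\mathcal{L} = \mathbb{E}_{q(Z \mid X, A)}[\log p(A \mid Z)] - \mathrm{KL}[q(Z \mid X, A) \,\|\, p(Z)]$$
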
- The model is lightweight. All learnable parameters are in the encoder.
- Some downsides include:
  - Learning latent representations does not guarantee good performance.
  - The assumption of independence may be limiting.
  - The standard Gaussian prior may be a poor choice.
  - The decoder needs to account for the sparsity of the graph.
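
As a rough sketch of how the VGAE objective and the sparsity concern fit together, the code below computes the negative of the objective as a reconstruction cross entropy plus a KL term. For a diagonal Gaussian against the standard Gaussian prior, the KL term has the closed form $-\frac{1}{2} \sum (1 + 2 \log \sigma - \mu^2 - \sigma^2)$. The function name `vgae_loss`, the `pos_weight` argument, its default (ratio of non-edges to edges), and the choice to skip self-pairs are assumptions made for illustration; up-weighting edges is one possible way to handle sparse graphs, not necessarily the one intended in these notes.

```python
import math

def log_sigmoid(v):
    """Numerically stable log(sigmoid(v))."""
    return -math.log1p(math.exp(-v)) if v >= 0 else v - math.log1p(math.exp(v))

def vgae_loss(adj, mu, log_sigma, z, pos_weight=None):
    """Return (negative ELBO, reconstruction term, KL term) as unnormalized sums."""
    n = len(adj)
    num_edges = sum(adj[i][j] for i in range(n) for j in range(n) if i != j)
    num_non_edges = n * (n - 1) - num_edges
    if pos_weight is None:
        # Assumed heuristic: weight each edge by the non-edge / edge ratio.
        pos_weight = num_non_edges / max(num_edges, 1)
    recon = 0.0
    for i in range(n):
        for j in range(n):
            if i == j:
                continue  # self-pairs are skipped in this sketch
            logit = sum(a * b for a, b in zip(z[i], z[j]))
            if adj[i][j]:
                recon -= pos_weight * log_sigmoid(logit)  # -log p(A_ij = 1 | z_i, z_j)
            else:
                recon -= log_sigmoid(-logit)              # -log p(A_ij = 0 | z_i, z_j)
    # KL[N(mu, sigma^2) || N(0, I)] summed over nodes and latent dimensions.
    kl = -0.5 * sum(1.0 + 2.0 * s - m * m - math.exp(2.0 * s)
                    for m_row, s_row in zip(mu, log_sigma)
                    for m, s in zip(m_row, s_row))
    return recon + kl, recon, kl

adj = [[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]]  # toy path graph
zeros = [[0.0, 0.0] for _ in range(4)]
loss, recon, kl = vgae_loss(adj, mu=zeros, log_sigma=zeros, z=zeros)
```

With all-zero latents every edge probability is 0.5 and the posterior equals the prior, so the KL term vanishes and only the reconstruction term remains.
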

DGI

Deep Graph Infomax (DGI) aims to maximize the mutual information between the latent representations of nodes (local information) and the latent representation of the graph (global information).
Node representations are obtained using an encoder $\mathcal{E}$. We have

$$H = \mathcal{E}(X, A) = \{h_1, h_2, \dots, h_N\}$$

The local representation $h_i$ for node $i$ contains local information near node $i$ (i.e., information is passed between neighbors that are up to $k$ steps away, where $k$ is the number of message-passing layers in the encoder).

Similarly, for the graph representation we use a readout function $\mathcal{R}$ (i.e., pooling) such that $s = \mathcal{R}(H)$.
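
To make the local and global representations concrete, here is a minimal sketch that assumes a single GCN layer as the encoder and a sigmoid of the mean node representation as the readout. Both choices, the function names `encoder` and `readout`, and the toy inputs are illustrative assumptions; the notes do not fix a particular encoder or pooling function.

```python
import math

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def normalize_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2}: add self-loops, then scale symmetrically."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]

def sigmoid(v):
    """Numerically stable logistic function."""
    return 1.0 / (1.0 + math.exp(-v)) if v >= 0 else math.exp(v) / (1.0 + math.exp(v))

def encoder(adj, x, w):
    """One GCN layer, ReLU(A_norm X W): h_i mixes node i with its 1-hop neighbors."""
    a_norm = normalize_adjacency(adj)
    return [[max(0.0, v) for v in row] for row in matmul(a_norm, matmul(x, w))]

def readout(h):
    """Graph summary s: sigmoid of the mean node representation (mean pooling)."""
    n = len(h)
    return [sigmoid(sum(col) / n) for col in zip(*h)]

adj = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]                # toy path graph on 3 nodes
x = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]  # one-hot features
w = [[1.0, -1.0], [0.5, 2.0], [-2.0, 0.3]]             # stand-in for learned weights

h = encoder(adj, x, w)  # one row h_i per node (local information)
s = readout(h)          # single vector for the whole graph (global information)
```

Stacking more layers in `encoder` would let each `h[i]` see nodes further away, while `s` always summarizes every node at once.
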
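Let $\mathcal{D}$ be a classifier (discriminator) that predicts whether a pair of node representation $h_i$ and graph representation $s$ comes from a positive sample or a negative sample. Let $N$ and $M$ be the numbers of positive and negative samples, respectively. Also let $\tilde{h}_j$ be the $j$-th node representation from the negative sample.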

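The objective is the negative binary cross entropy, which we maximize:

$$\mathcal{L} = \frac{1}{N + M} \left( \sum_{i=1}^{N} \mathbb{E}_{(X, A)} \left[ \log \mathcal{D}(h_i, s) \right] + \sum_{j=1}^{M} \mathbb{E}_{(\tilde{X}, \tilde{A})} \left[ \log \left( 1 - \mathcal{D}(\tilde{h}_j, s) \right) \right] \right)$$
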
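
The objective can be sketched for a single graph as follows, assuming a bilinear discriminator $\mathcal{D}(h, s) = \mathrm{sigmoid}(h^\top W s)$. The discriminator form, the names `score` and `dgi_objective`, and the toy vectors are assumptions for illustration only.

```python
import math

def log_sigmoid(v):
    """Numerically stable log(sigmoid(v))."""
    return -math.log1p(math.exp(-v)) if v >= 0 else v - math.log1p(math.exp(v))

def score(h, s, w):
    """Bilinear score h^T W s; the discriminator output is sigmoid(score)."""
    return sum(h[i] * w[i][j] * s[j] for i in range(len(h)) for j in range(len(s)))

def dgi_objective(h_pos, h_neg, s, w):
    """Negative binary cross entropy over positive and negative (node, summary) pairs."""
    n, m = len(h_pos), len(h_neg)
    pos = sum(log_sigmoid(score(h, s, w)) for h in h_pos)   # log D(h_i, s)
    neg = sum(log_sigmoid(-score(h, s, w)) for h in h_neg)  # log(1 - D(h~_j, s))
    return (pos + neg) / (n + m)

s = [1.0, 0.0]                    # toy graph summary
w = [[1.0, 0.0], [0.0, 1.0]]      # stand-in for the learned discriminator weight
h_pos = [[1.0, 0.0], [1.0, 0.0]]  # node representations from the real graph
h_neg = [[-1.0, 0.0]]             # node representations from the corrupted graph
objective = dgi_objective(h_pos, h_neg, s, w)
```

The identity `log(1 - sigmoid(v)) = log_sigmoid(-v)` is used for the negative term. Training would maximize `objective` (or minimize its negation) with respect to the encoder and discriminator weights.
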
We perform positive sampling by sampling nodes from the graph to obtain the pairs $(h_i, s)$ for $i = 1, \dots, N$.
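We perform negative sampling using a corruption function $\mathcal{C}$ applied to the graph data as follows: $(\tilde{X}, \tilde{A}) = \mathcal{C}(X, A)$. The negative node representations $\tilde{h}_j$ are then obtained by encoding the corrupted graph, $\tilde{H} = \mathcal{E}(\tilde{X}, \tilde{A})$.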
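
One common choice of corruption, assumed here purely for illustration, is to shuffle the rows of the feature matrix while keeping the adjacency matrix unchanged, so that each node keeps its neighbors but receives another node's features. The function name `corrupt` and the toy data are hypothetical.

```python
import random

def corrupt(x, adj, rng):
    """Shuffle feature rows across nodes; the adjacency matrix is returned unchanged."""
    perm = list(range(len(x)))
    rng.shuffle(perm)
    x_tilde = [list(x[p]) for p in perm]  # copy rows so the original x is not modified
    return x_tilde, adj

rng = random.Random(0)
adj = [[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]]  # toy path graph
x = [[0.0, 1.0], [2.0, 3.0], [4.0, 5.0], [6.0, 7.0]]            # toy node features

x_tilde, adj_tilde = corrupt(x, adj, rng)
```

Passing `x_tilde` and `adj_tilde` through the same encoder as the original graph yields the negative node representations, which are then paired with the summary of the uncorrupted graph.
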