VGAE
- Variational Graph Autoencoders (VGAE) apply the techniques of variational autoencoders (VAEs) to graphs.
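- Each node in the graph is associated with a feature vector, stored as a row of the feature matrix $X$ (an $N \times D$ matrix for $N$ nodes with $D$ features each). Each node is also associated with a latent variable $z_i$, stored as a row of the matrix $Z$. The graph structure is given by the adjacency matrix $A$. The goal is to infer the latent variables and decode edges from them.
- We have the following components.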
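  - An encoder to learn the latent variable distribution. We denote this as $q(Z \mid X, A) = \prod_{i=1}^{N} q(z_i \mid X, A)$, with $q(z_i \mid X, A) = \mathcal{N}(z_i \mid \mu_i, \operatorname{diag}(\sigma_i^2))$. We assume nodes are independent. Here $\mathcal{N}$ is the Normal distribution and $\mathrm{GCN}$ denotes a graph convolutional network applied to the features $X$ and the adjacency matrix $A$: $\mu = \mathrm{GCN}_{\mu}(X, A)$ and $\log \sigma = \mathrm{GCN}_{\sigma}(X, A)$. The prediction is done using $\mathrm{GCN}(X, A) = \tilde{A}\,\mathrm{ReLU}(\tilde{A} X W_0)\,W_1$, where $\tilde{A}$ is the symmetrically normalized adjacency matrix. Each $W_i$ is a learnable parameter.
  - A decoder to predict connections between nodes given latent variables. We denote this as $p(A \mid Z) = \prod_{i=1}^{N} \prod_{j=1}^{N} p(A_{ij} \mid z_i, z_j)$, with $p(A_{ij} = 1 \mid z_i, z_j) = \sigma(z_i^\top z_j)$, where $\sigma$ here is the logistic sigmoid. We assume conditional independence among possible edges for tractability. The decoder has no learnable parameters.
  - The prior distribution of latent variables is set to the standard Gaussian. That is, $p(Z) = \prod_{i=1}^{N} \mathcal{N}(z_i \mid 0, I)$.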
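The components above can be sketched in NumPy as a single forward pass. This is a minimal illustration, not a reference implementation: the function names, the toy graph, the layer sizes, the self-loops added before normalization, and the first GCN layer shared between the mean and log-standard-deviation branches are all assumptions made for the example.

```python
import numpy as np

def normalize_adjacency(A):
    """Symmetric normalization D^{-1/2} (A + I) D^{-1/2} (self-loops are an assumption)."""
    A_hat = A + np.eye(A.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))
    return A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def encode(X, A_norm, W0, W_mu, W_sigma):
    """Two-layer GCN encoder; the first layer W0 is shared by both outputs (assumption)."""
    hidden = np.maximum(A_norm @ X @ W0, 0.0)   # ReLU(A_norm X W0)
    mu = A_norm @ hidden @ W_mu                 # per-node means
    log_sigma = A_norm @ hidden @ W_sigma       # per-node log standard deviations
    return mu, log_sigma

def sample_latent(mu, log_sigma, rng):
    """Reparameterization: z = mu + sigma * eps with eps ~ N(0, I)."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(log_sigma) * eps

def decode(Z):
    """Inner-product decoder: sigmoid(z_i^T z_j) for every node pair; no parameters."""
    return 1.0 / (1.0 + np.exp(-(Z @ Z.T)))

rng = np.random.default_rng(0)
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)       # toy 4-node path graph
X = np.eye(4)                                   # one-hot node features
W0 = rng.normal(scale=0.1, size=(4, 8))
W_mu = rng.normal(scale=0.1, size=(8, 2))
W_sigma = rng.normal(scale=0.1, size=(8, 2))

A_norm = normalize_adjacency(A)
mu, log_sigma = encode(X, A_norm, W0, W_mu, W_sigma)
Z = sample_latent(mu, log_sigma, rng)
A_prob = decode(Z)                              # predicted edge probabilities
```

Note that `decode` takes only `Z`, which mirrors the statement that the decoder has no learnable parameters.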
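- The objective is to maximize the variational lower bound $\mathcal{L} = \mathbb{E}_{q(Z \mid X, A)}[\log p(A \mid Z)] - \mathrm{KL}[q(Z \mid X, A) \,\|\, p(Z)]$, where $q(Z \mid X, A)$ is the encoder distribution, $p(A \mid Z)$ is the decoder likelihood, and $p(Z)$ is the prior.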
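As a rough sketch of how this objective could be evaluated, the snippet below combines a Bernoulli reconstruction term with the closed-form KL divergence between a diagonal Gaussian and the standard Gaussian. The function names and the `eps` stabilizer are assumptions for illustration, and the expectation is assumed to be approximated by whatever single sample produced `A_prob`.

```python
import numpy as np

def reconstruction_log_likelihood(A, A_prob, eps=1e-10):
    """Sum over all node pairs of the Bernoulli log-likelihood log p(A_ij | z_i, z_j)."""
    return np.sum(A * np.log(A_prob + eps) + (1.0 - A) * np.log(1.0 - A_prob + eps))

def kl_to_standard_normal(mu, log_sigma):
    """KL[N(mu, diag(sigma^2)) || N(0, I)], summed over nodes and latent dimensions."""
    return 0.5 * np.sum(np.exp(2.0 * log_sigma) + mu**2 - 1.0 - 2.0 * log_sigma)

def elbo(A, A_prob, mu, log_sigma):
    """Lower bound: reconstruction term minus KL regularizer (to be maximized)."""
    return reconstruction_log_likelihood(A, A_prob) - kl_to_standard_normal(mu, log_sigma)
```

The KL term is zero exactly when the encoder outputs `mu = 0` and `log_sigma = 0`, i.e. when the posterior matches the prior.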
- The model is lightweight: all learnable parameters are in the encoder.
- Some downsides include
  - Learning latent representations does not by itself guarantee good performance on downstream tasks.
  - The independence assumptions may be limiting.
  - The standard Gaussian prior may be a poor choice.
  - The decoder needs to account for the sparsity of the graph.
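To illustrate the sparsity point, one common remedy (an assumption here, not something these notes prescribe) is to up-weight the positive entries of the adjacency matrix in the reconstruction loss, so that the many zero entries do not dominate. The function name and weighting scheme below are hypothetical choices for the sketch.

```python
import numpy as np

def weighted_reconstruction_loss(A, A_prob, eps=1e-10):
    """Binary cross entropy over all node pairs with edges re-weighted by (#non-edges / #edges)."""
    num_pos = A.sum()
    if num_pos == 0:
        raise ValueError("graph has no edges; positive weight is undefined")
    num_neg = A.size - num_pos
    pos_weight = num_neg / num_pos              # > 1 when the graph is sparse
    loss = -(pos_weight * A * np.log(A_prob + eps)
             + (1.0 - A) * np.log(1.0 - A_prob + eps))
    return loss.mean(), pos_weight
```

With this weighting, the edge and non-edge terms contribute comparable total mass to the loss regardless of how sparse the graph is.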
Deep Graph Infomax
- Deep Graph Infomax aims to maximize the mutual information between the latent representations of nodes (local information) and the latent representation of the graph (global information).
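- Node representations are obtained using an encoder $\mathcal{E}$. We have $H = \mathcal{E}(X, A) = \{h_1, \dots, h_N\}$. The local representation $h_i$ for node $i$ contains local information near node $i$ (i.e., information is passed between neighbors that are up to $k$ steps away).
- Similarly, for the graph representation we use a readout function $\mathcal{R}$ (i.e., pooling) such that $s = \mathcal{R}(H)$.
- Let $\mathcal{D}$ be a classifier that predicts whether $h_i$ and $s$ come from a positive sample or a negative sample. Let $N$ and $M$ be the number of positive and negative samples respectively. Also let $\tilde{h}_j$ be the $j$-th node representation from the negative sample. The objective is the negative binary cross entropy: $\mathcal{L} = \frac{1}{N + M}\left(\sum_{i=1}^{N} \log \mathcal{D}(h_i, s) + \sum_{j=1}^{M} \log\left(1 - \mathcal{D}(\tilde{h}_j, s)\right)\right)$.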
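The objective can be sketched as follows, assuming the node representations have already been computed by some encoder. The mean-pooling readout with a sigmoid and the bilinear form of the classifier are assumptions chosen for the example, as are all function and variable names.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def readout(H):
    """Graph summary via mean pooling over nodes, squashed by a sigmoid (assumed choice)."""
    return sigmoid(H.mean(axis=0))

def discriminator(H, s, W):
    """Bilinear classifier: sigmoid(h_i^T W s) for every row h_i of H."""
    return sigmoid(H @ W @ s)

def dgi_objective(H_pos, H_neg, W, eps=1e-10):
    """Negative binary cross entropy over positive and negative node representations."""
    s = readout(H_pos)                          # the summary comes from the real graph
    p_pos = discriminator(H_pos, s, W)          # should be pushed towards 1
    p_neg = discriminator(H_neg, s, W)          # should be pushed towards 0
    total = np.sum(np.log(p_pos + eps)) + np.sum(np.log(1.0 - p_neg + eps))
    return total / (len(H_pos) + len(H_neg))
```

The value is always non-positive and is maximized during training; `W` would be learned jointly with the encoder.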
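- We perform positive sampling by sampling nodes from the graph to obtain their representations $h_i$ from $H = \mathcal{E}(X, A)$.
- We perform negative sampling using a corruptor function $\mathcal{C}$ applied to the graph data as follows: $(\tilde{X}, \tilde{A}) = \mathcal{C}(X, A)$, with negative node representations $\tilde{H} = \mathcal{E}(\tilde{X}, \tilde{A}) = \{\tilde{h}_1, \dots, \tilde{h}_M\}$.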
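One simple corruptor, shown here only as an assumed example, shuffles the rows of the feature matrix while leaving the adjacency matrix untouched, so each node ends up with another node's features. The function name `corrupt` is hypothetical.

```python
import numpy as np

def corrupt(X, A, rng):
    """Row-shuffle corruption: permute node features, keep the graph structure."""
    perm = rng.permutation(X.shape[0])
    return X[perm], A

rng = np.random.default_rng(0)
X = np.arange(12, dtype=float).reshape(4, 3)    # toy features for 4 nodes
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
X_tilde, A_tilde = corrupt(X, A, rng)
```

Passing the corrupted pair through the same encoder would then yield the negative node representations.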