VGAE

  • Variational Graph Autoencoders (VGAE) apply the techniques from VAEs to graphs.

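  • Each node in the graph is associated with a feature vector, stored as a row of the feature matrix $X \in \mathbb{R}^{N \times D}$. Each node is also associated with a latent variable, stored as a row of the matrix $Z \in \mathbb{R}^{N \times F}$. The graph structure is given by the adjacency matrix $A$. The goal is to infer the latent variables and decode edges from them.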

  • We have the following components.

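    • An encoder to learn the latent variable distribution. We denote this as $q(Z \mid X, A)$. We assume nodes are independent, so the distribution factorizes over nodes.
      $$q(Z \mid X, A) = \prod_{i=1}^{N} \mathcal{N}\left(z_i \mid \mu_i, \operatorname{diag}(\sigma_i^2)\right), \qquad \mu = \operatorname{GCN}_\mu(X, A), \qquad \log \sigma = \operatorname{GCN}_\sigma(X, A)$$
      Where $\mathcal{N}$ is the Normal distribution and $\operatorname{GCN}$ denotes a graph convolutional network (GCN). The prediction is done using
      $$\operatorname{GCN}(X, A) = \tilde{A} \operatorname{ReLU}(\tilde{A} X W_0) W_1$$
      where $\tilde{A}$ is the symmetrically normalized adjacency matrix. Each $W_i$ is a learnable parameter.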
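    • A decoder to predict connections between nodes given latent variables. We denote this as $p(A \mid Z)$. We assume conditional independence among possible edges for tractability, so $p(A \mid Z) = \prod_{i=1}^{N} \prod_{j=1}^{N} p(A_{ij} \mid z_i, z_j)$ with $p(A_{ij} = 1 \mid z_i, z_j) = \operatorname{sigmoid}(z_i^\top z_j)$. Because it is only an inner product followed by a sigmoid, the decoder has no learnable parameters.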
    • The prior distribution of latent variables is set to the standard Gaussian. That is, $p(Z) = \prod_{i=1}^{N} \mathcal{N}(z_i \mid 0, I)$.
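  • The objective is to maximize the evidence lower bound (ELBO) with respect to the parameters $W_i$

    $$\mathcal{L} = \mathbb{E}_{q(Z \mid X, A)}\left[\log p(A \mid Z)\right] - \operatorname{KL}\left[q(Z \mid X, A) \,\|\, p(Z)\right]$$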

    • The model is lightweight. All learnable parameters are in the encoder.
    • Some downsides include
      • Learning latent representations does not by itself guarantee good downstream performance.
      • The assumption of independence may be limiting.
      • The standard Gaussian prior may be a poor choice.
      • The decoder needs to account for the sparsity of the graph.
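
The components above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not a reference implementation: the function names, the toy path graph, the weight shapes, and the `pos_weight` re-weighting of edges (one common heuristic for sparse graphs) are all chosen here for illustration. It adds self-loops before normalizing the adjacency matrix and uses the closed-form KL divergence between a diagonal Gaussian and the standard Gaussian.

```python
import numpy as np

def normalize_adjacency(A):
    """Symmetric normalization D^{-1/2} (A + I) D^{-1/2} (self-loops are an assumption)."""
    A_hat = A + np.eye(A.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))
    return A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def encode(X, A_norm, W0, W_mu, W_sigma):
    """Two-layer GCN encoder; the first layer W0 is shared by both heads."""
    hidden = np.maximum(A_norm @ X @ W0, 0.0)  # ReLU
    mu = A_norm @ hidden @ W_mu
    log_sigma = A_norm @ hidden @ W_sigma
    return mu, log_sigma

def sample_latent(mu, log_sigma, rng):
    """Reparameterization trick: z = mu + sigma * eps with eps ~ N(0, I)."""
    return mu + np.exp(log_sigma) * rng.standard_normal(mu.shape)

def decode(Z):
    """Inner-product decoder: p(A_ij = 1 | z_i, z_j) = sigmoid(z_i . z_j)."""
    return 1.0 / (1.0 + np.exp(-(Z @ Z.T)))

def kl_to_prior(mu, log_sigma):
    """Closed-form KL[q(Z | X, A) || N(0, I)] for a diagonal Gaussian q."""
    return -0.5 * np.sum(1.0 + 2.0 * log_sigma - mu**2 - np.exp(2.0 * log_sigma))

def elbo(A, A_prob, mu, log_sigma, pos_weight=1.0, eps=1e-9):
    """Single-sample ELBO estimate; pos_weight > 1 up-weights the (rare) edges."""
    log_lik = pos_weight * A * np.log(A_prob + eps) + (1.0 - A) * np.log(1.0 - A_prob + eps)
    return log_lik.sum() - kl_to_prior(mu, log_sigma)

# Toy example: a path graph on 4 nodes with one-hot node features.
rng = np.random.default_rng(0)
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
X = np.eye(4)
W0 = 0.1 * rng.standard_normal((4, 8))
W_mu = 0.1 * rng.standard_normal((8, 2))
W_sigma = 0.1 * rng.standard_normal((8, 2))

A_norm = normalize_adjacency(A)
mu, log_sigma = encode(X, A_norm, W0, W_mu, W_sigma)
Z = sample_latent(mu, log_sigma, rng)
A_prob = decode(Z)
pos_weight = (A.size - A.sum()) / A.sum()  # ratio of non-edges to edges
value = elbo(A, A_prob, mu, log_sigma, pos_weight=pos_weight)
print(value)
```

In practice the weights would be trained by gradient ascent on the ELBO using an automatic differentiation framework; the sketch only evaluates the objective once for randomly initialized weights.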

Deep Graph Infomax

  • Deep Graph Infomax (DGI) aims to maximize the mutual information between the latent representation of nodes (local information) and the latent representation of the graph (global information).

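  • Node representations are obtained using an encoder $\mathcal{E}$. We have

    $$H = \mathcal{E}(X, A) = \{h_1, h_2, \dots, h_N\}$$

    The local representation $h_i$ for node $i$ contains local information near node $i$ (i.e., information is passed between neighbors that are up to $k$ steps away, where $k$ is the number of message-passing layers in the encoder).

    Similarly for the graph representation, we use a readout function $\mathcal{R}$ (i.e., pooling) such that

    $$s = \mathcal{R}(H)$$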

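  • Let $\mathcal{D}$ be a classifier that predicts whether $h_i$ and $s$ come from a positive sample or a negative sample. Let $N$ and $M$ be the number of positive and negative samples respectively. Also let $\tilde{h}_j$ be the $j$-th node representation from the negative sample.

    $$\mathcal{L} = \frac{1}{N + M} \left( \sum_{i=1}^{N} \mathbb{E}_{(X, A)}\left[\log \mathcal{D}(h_i, s)\right] + \sum_{j=1}^{M} \mathbb{E}_{(\tilde{X}, \tilde{A})}\left[\log\left(1 - \mathcal{D}(\tilde{h}_j, s)\right)\right] \right)$$

    The objective, which is maximized, is the negative binary cross entropy.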

  • We perform positive sampling by sampling nodes from the graph to obtain the pairs $(h_i, s)$.

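  • We perform negative sampling using a corruptor function $\mathcal{C}$, applied to the graph data as follows

    $$(\tilde{X}, \tilde{A}) = \mathcal{C}(X, A), \qquad \tilde{H} = \mathcal{E}(\tilde{X}, \tilde{A}) = \{\tilde{h}_1, \dots, \tilde{h}_M\}$$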
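
The Deep Graph Infomax pipeline described above can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not a reference implementation: the one-layer GCN encoder with ReLU, the sigmoid-of-mean readout, the bilinear discriminator, and the row-shuffling corruptor are common choices picked here for concreteness, and all function names, weight shapes, and the toy graph are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def normalize_adjacency(A):
    """Symmetric normalization D^{-1/2} (A + I) D^{-1/2} (self-loops are an assumption)."""
    A_hat = A + np.eye(A.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))
    return A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def encoder(X, A_norm, W):
    """Encoder E(X, A): a one-layer GCN, so each row h_i sees its 1-hop neighborhood."""
    return np.maximum(A_norm @ X @ W, 0.0)

def readout(H):
    """Readout R(H): sigmoid of the mean node representation, giving the summary s."""
    return sigmoid(H.mean(axis=0))

def discriminator(H, s, W_d):
    """Bilinear classifier D(h_i, s) = sigmoid(h_i^T W_d s), one score per node."""
    return sigmoid(H @ W_d @ s)

def corrupt(X, A, rng):
    """Corruptor C(X, A): shuffle feature rows across nodes, keep the structure."""
    return X[rng.permutation(X.shape[0])], A

def dgi_objective(H, H_neg, s, W_d, eps=1e-9):
    """Negative binary cross entropy over positive and negative (node, summary) pairs."""
    pos = np.log(discriminator(H, s, W_d) + eps).sum()
    neg = np.log(1.0 - discriminator(H_neg, s, W_d) + eps).sum()
    return (pos + neg) / (H.shape[0] + H_neg.shape[0])

# Toy example: a path graph on 4 nodes with random 5-dimensional features.
rng = np.random.default_rng(0)
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
X = rng.standard_normal((4, 5))
W = 0.1 * rng.standard_normal((5, 3))
W_d = 0.1 * rng.standard_normal((3, 3))

H = encoder(X, normalize_adjacency(A), W)              # positive sample
s = readout(H)                                         # graph summary
X_neg, A_neg = corrupt(X, A, rng)                      # negative sample
H_neg = encoder(X_neg, normalize_adjacency(A_neg), W)
value = dgi_objective(H, H_neg, s, W_d)
print(value)
```

Training would maximize this objective with respect to the encoder and discriminator weights; note that the summary $s$ is always computed from the uncorrupted graph, so negative pairs match corrupted node representations against the true summary.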

Links