- Contrastive learning is a technique for unsupervised representation learning.
- We frame the problem as follows. Let $\{x_i\}$ be the input samples, each with a label $y_i \in \{1, \dots, L\}$ belonging to one of $L$ classes. The goal is to learn an embedding function $f_\theta(\cdot)$ such that similar instances are close together and dissimilar instances are far apart.
Training Objectives
- For the following, $\|\cdot\|_2$ denotes the L2 norm. For a given input, a positive sample is one from the same class and a negative sample is one from a different class. We present several loss functions that can be used in this setting.
- **Contrastive loss** is defined as follows:

  $$\mathcal{L}_\text{cont}(x_i, x_j, \theta) = \mathbb{1}[y_i = y_j]\, \|f_\theta(x_i) - f_\theta(x_j)\|_2^2 + \mathbb{1}[y_i \neq y_j]\, \max\big(0,\, \epsilon - \|f_\theta(x_i) - f_\theta(x_j)\|_2\big)^2$$

  where the margin $\epsilon$ defines the lower bound on the distance between samples from different classes.
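  As a concrete illustration, here is a minimal PyTorch sketch of this loss. It assumes the embeddings of both samples in each pair have already been computed; the function name `contrastive_loss`, the boolean `same_class` mask, and the default `margin` value are illustrative choices, not part of the original formulation.

  ```python
  import torch
  import torch.nn.functional as F

  def contrastive_loss(z_i, z_j, same_class, margin=1.0):
      """Pairwise contrastive loss.

      z_i, z_j:   (B, d) embeddings f(x_i), f(x_j) for each pair.
      same_class: (B,) boolean tensor, True where y_i == y_j.
      margin:     the epsilon lower bound on inter-class distance.
      """
      d = F.pairwise_distance(z_i, z_j)        # Euclidean distance per pair
      pos = d.pow(2)                           # pull same-class pairs together
      neg = F.relu(margin - d).pow(2)          # push different-class pairs past the margin
      return torch.where(same_class, pos, neg).mean()

  # Toy usage with random embeddings.
  z_i, z_j = torch.randn(8, 128), torch.randn(8, 128)
  same_class = torch.randint(0, 2, (8,)).bool()
  print(contrastive_loss(z_i, z_j, same_class))
  ```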
- **Triplet loss** is defined as follows. Select an anchor point $x$ together with samples $x^+$ and $x^-$, which are positive and negative respectively (preferably choose challenging samples). The goal is to minimize the distance between $x$ and $x^+$ and maximize the distance between $x$ and $x^-$:

  $$\mathcal{L}_\text{triplet}(x, x^+, x^-) = \sum_{x \in \mathcal{X}} \max\big(0,\, \|f(x) - f(x^+)\|_2^2 - \|f(x) - f(x^-)\|_2^2 + \epsilon\big)$$

  where $\epsilon$ is a margin hyperparameter.
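  A minimal PyTorch sketch, assuming triplets have already been mined and embedded; it uses the squared-distance form above, and the default `margin` is illustrative. PyTorch's built-in `torch.nn.TripletMarginLoss` implements a closely related (non-squared) variant.

  ```python
  import torch
  import torch.nn.functional as F

  def triplet_loss(anchor, positive, negative, margin=0.2):
      """Triplet loss over a batch of (anchor, positive, negative) embeddings."""
      d_pos = (anchor - positive).pow(2).sum(dim=1)   # squared distance to the positive
      d_neg = (anchor - negative).pow(2).sum(dim=1)   # squared distance to the negative
      return F.relu(d_pos - d_neg + margin).mean()

  anchor, pos, neg = (torch.randn(16, 64) for _ in range(3))
  print(triplet_loss(anchor, pos, neg))
  ```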
- **Lifted structured loss** uses all pairwise edges within a training batch. For convenience, let $D_{ij} = \|f(x_i) - f(x_j)\|_2$, and denote by $\mathcal{P}$ and $\mathcal{N}$ the sets of positive and negative sample pairs respectively. Then:

  $$\mathcal{L}_\text{struct} = \frac{1}{2|\mathcal{P}|} \sum_{(i,j) \in \mathcal{P}} \max\big(0,\, \mathcal{L}^{(ij)}_\text{struct}\big)^2$$

  $$\text{where}\quad \mathcal{L}^{(ij)}_\text{struct} = D_{ij} + \log\Big( \sum_{(i,k) \in \mathcal{N}} \exp(\epsilon - D_{ik}) + \sum_{(j,l) \in \mathcal{N}} \exp(\epsilon - D_{jl}) \Big)$$

  The log-sum-exp term in the last equation above is a smooth relaxation of the hard maximum $\max\big( \max_{(i,k) \in \mathcal{N}} (\epsilon - D_{ik}),\; \max_{(j,l) \in \mathcal{N}} (\epsilon - D_{jl}) \big)$, which makes the loss more amenable to gradient-based learning.
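  A sketch of the smooth (log-sum-exp) version over a single batch, with a `margin` argument playing the role of $\epsilon$; the vectorized masking scheme is one possible implementation, not the reference one.

  ```python
  import torch

  def lifted_structured_loss(embeddings, labels, margin=1.0):
      """Smooth lifted structured loss over all pairs in a batch.

      embeddings: (B, d) tensor; labels: (B,) integer class labels.
      """
      D = torch.cdist(embeddings, embeddings)              # D_ij = ||f(x_i) - f(x_j)||_2
      same = labels.unsqueeze(0) == labels.unsqueeze(1)
      eye = torch.eye(len(labels), dtype=torch.bool)
      pos_mask = same & ~eye                               # positive pairs (i != j, same class)
      neg_mask = ~same                                     # negative pairs (different class)

      # Per anchor i: log sum_k exp(margin - D_ik) over its negatives.
      neg_term = torch.logsumexp(
          (margin - D).masked_fill(~neg_mask, float('-inf')), dim=1
      )

      # L_ij = D_ij + log( sum over i's negatives + sum over j's negatives )
      L = D + torch.logaddexp(neg_term.unsqueeze(1), neg_term.unsqueeze(0))
      # Each positive pair appears twice in the mask, hence the extra factor of 1/2.
      return torch.relu(L[pos_mask]).pow(2).mean() / 2.0

  embeddings, labels = torch.randn(8, 32), torch.randint(0, 3, (8,))
  print(lifted_structured_loss(embeddings, labels))
  ```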
- **Multi-class N-pair loss** generalizes triplet loss to comparison with multiple negative samples. Samples are arranged into $(N+1)$-tuplets $\{x, x^+, x^-_1, \dots, x^-_{N-1}\}$ containing one anchor, one positive, and $N-1$ negatives, and we calculate

  $$\mathcal{L}_\text{N-pair}\big(x, x^+, \{x^-_i\}_{i=1}^{N-1}\big) = \log\Big( 1 + \sum_{i=1}^{N-1} \exp\big( f(x)^\top f(x^-_i) - f(x)^\top f(x^+) \big) \Big)$$
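  A sketch assuming each anchor comes with one positive and a stack of $N-1$ negatives already embedded; the tensor shapes and the `n_pair_loss` name are illustrative.

  ```python
  import torch
  import torch.nn.functional as F

  def n_pair_loss(anchor, positive, negatives):
      """Multi-class N-pair loss.

      anchor, positive: (B, d); negatives: (B, N-1, d).
      """
      pos_score = (anchor * positive).sum(dim=1, keepdim=True)    # f(x)^T f(x^+), shape (B, 1)
      neg_score = torch.einsum('bd,bnd->bn', anchor, negatives)   # f(x)^T f(x_i^-), shape (B, N-1)
      # log(1 + sum_i exp(neg_i - pos)) == softplus(logsumexp(neg - pos))
      return F.softplus(torch.logsumexp(neg_score - pos_score, dim=1)).mean()

  anchor, positive = torch.randn(4, 32), torch.randn(4, 32)
  negatives = torch.randn(4, 7, 32)
  print(n_pair_loss(anchor, positive, negatives))
  ```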
- **Noise Contrastive Estimation (NCE)** involves distinguishing signal from noise. We define $x \sim p(x)$ and $\tilde{x} \sim q(\tilde{x})$ as positive (data) and negative (noise) samples respectively. Let $\sigma(\cdot)$ denote the sigmoid and $\ell_\theta(u) = \log \frac{p_\theta(u)}{q(u)}$ the logit of a sample. Then

  $$\mathcal{L}_\text{NCE} = -\frac{1}{N} \sum_{i=1}^{N} \Big[ \log \sigma\big(\ell_\theta(x_i)\big) + \log\big(1 - \sigma(\ell_\theta(\tilde{x}_i))\big) \Big]$$
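  A sketch of this binary cross-entropy form, assuming the model already produces the logit $\ell_\theta(u) = \log p_\theta(u) - \log q(u)$ for each sample; the function name and shapes are illustrative.

  ```python
  import torch
  import torch.nn.functional as F

  def nce_loss(logit_pos, logit_neg):
      """NCE loss given logits l_theta(u) = log p_theta(u) - log q(u).

      logit_pos: (N,) logits of samples drawn from the data distribution.
      logit_neg: (N,) logits of samples drawn from the noise distribution q.
      """
      # -[ log sigma(l(x)) + log(1 - sigma(l(x_tilde))) ], averaged over the batch
      loss_pos = F.binary_cross_entropy_with_logits(logit_pos, torch.ones_like(logit_pos))
      loss_neg = F.binary_cross_entropy_with_logits(logit_neg, torch.zeros_like(logit_neg))
      return loss_pos + loss_neg

  print(nce_loss(torch.randn(16), torch.randn(16)))
  ```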
- **InfoNCE loss** uses the categorical cross-entropy loss to identify the positive sample among a set of negative samples. Let $c$ be a context vector. Generate one positive sample $x^+ \sim p(x \mid c)$ and $N-1$ negative samples $x^-_i \sim p(x)$. Also let $X = \{x_1, \dots, x_N\}$ be the set containing the positive and all the negatives. Now, the probability of detecting the positive sample correctly is

  $$p(x^+ \mid X, c) = \frac{f(x^+, c)}{\sum_{x_j \in X} f(x_j, c)}$$

  where the scoring function $f(x, c) \propto \frac{p(x \mid c)}{p(x)}$ estimates a density ratio. This approximator is what is used to maximize the mutual information between $x$ and $c$. The InfoNCE loss is then

  $$\mathcal{L}_\text{InfoNCE} = -\mathbb{E}\Big[ \log \frac{f(x^+, c)}{\sum_{x_j \in X} f(x_j, c)} \Big]$$
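  A sketch that scores candidates with a temperature-scaled dot product and reduces the loss to a categorical cross-entropy; the dot-product scoring function, the `temperature`, and the tensor shapes are assumptions for the example rather than part of the definition above.

  ```python
  import torch
  import torch.nn.functional as F

  def info_nce_loss(context, candidates, positive_idx, temperature=0.1):
      """InfoNCE as categorical cross-entropy over candidate scores.

      context:      (B, d) context embeddings c.
      candidates:   (B, N, d) candidate embeddings, exactly one positive per row.
      positive_idx: (B,) index of the positive sample within each candidate set.
      """
      scores = torch.einsum('bd,bnd->bn', context, candidates) / temperature  # log-scores
      # -log [ exp(score_pos) / sum_j exp(score_j) ]
      return F.cross_entropy(scores, positive_idx)

  context = torch.randn(4, 64)
  candidates = torch.randn(4, 10, 64)
  positive_idx = torch.randint(0, 10, (4,))
  print(info_nce_loss(context, candidates, positive_idx))
  ```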
- **Soft-nearest neighbors loss** extends InfoNCE to use multiple positive samples. Let $f(\cdot, \cdot)$ be a similarity metric measuring the distance between two inputs and let $\tau$ be a temperature parameter. Then for a batch $\{(x_i, y_i)\}_{i=1}^{B}$ we have

  $$\mathcal{L}_\text{snn} = -\frac{1}{B} \sum_{i=1}^{B} \log \frac{\sum_{j \neq i,\, y_j = y_i} \exp\big(-f(x_i, x_j)/\tau\big)}{\sum_{k \neq i} \exp\big(-f(x_i, x_k)/\tau\big)}$$

  The temperature $\tau$ controls how concentrated the learned representations are; a high temperature means the representations will tend to be more concentrated.
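  A sketch over a single labelled batch, using squared Euclidean distance as the similarity metric $f$; that choice, along with the names and default temperature, is an assumption for the example.

  ```python
  import torch

  def soft_nearest_neighbors_loss(embeddings, labels, temperature=1.0):
      """Soft-nearest-neighbors loss over a batch.

      embeddings: (B, d); labels: (B,) integer class labels; temperature: tau.
      """
      B = embeddings.size(0)
      dist = torch.cdist(embeddings, embeddings).pow(2)                # f(x_i, x_k), squared Euclidean
      eye = torch.eye(B, dtype=torch.bool)
      logits = (-dist / temperature).masked_fill(eye, float('-inf'))   # exclude k == i
      same = (labels.unsqueeze(0) == labels.unsqueeze(1)) & ~eye       # positives: j != i, y_j == y_i

      log_num = torch.logsumexp(logits.masked_fill(~same, float('-inf')), dim=1)
      log_den = torch.logsumexp(logits, dim=1)
      return -(log_num - log_den).mean()

  embeddings = torch.randn(16, 32)
  labels = torch.arange(4).repeat(4)   # 4 classes, 4 samples each, so every anchor has positives
  print(soft_nearest_neighbors_loss(embeddings, labels))
  ```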