Contrastive Loss (InfoNCE / SimCLR-style)

Sequence Models & Generative Models DS practice problem on Onlearn.

Difficulty: medium.

Topics: Contrastive Loss (InfoNCE / SimCLR-style), Temperature Scaling, Negative Sampling, Cosine Similarity, Log-Sum-Exp Trick, Positive Pair Alignment, Representation Learning, Information Theory, Optimization Theory, Metric Learning, Probabilistic Graphical Models, Self-Supervised Learning, Stochastic Gradient Descent, Embedding Space Geometry, Mutual Information Estimation, Data Augmentation Strategies.

Implement the NT-Xent (Normalized Temperature-scaled Cross Entropy) contrastive loss function, commonly known as the SimCLR contrastive loss or InfoNCE loss. This loss is the backbone of self-supervised contrastive learning frameworks. Given a batch of 2N embedding vectors where each consecutive pair (2i, 2i+1) represents two augmented views of the same input sample, the goal is to pull representations of the same sample closer together while pushing representations of different samples apart.

Your function receives:

embeddings: a numpy array of shape (2N, d), where N is the number of original samples and d is the embedding dimension. Embeddings at indices (0,1), (2,3), (4,5), ... form positive pairs.

temperature: a positive float scaling parameter (tau).

The function should:

1. Compute pairwise cosine similarities between all L2-normalized embeddings.
2. For each of the 2N anchors, treat its paired view as the positive and all other 2N - 2 samples (excluding itself) as negatives.
3. Compute the contrastive loss for each anchor and return the mean loss over all 2N anchors.

Use the log-sum-exp trick for numerical stability when computing the log of the sum of exponentials.
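The steps above can be sketched in NumPy as follows. This is one possible reading of the task, not a reference solution: the function name nt_xent_loss, the 1e-12 floor on the norms, and the ValueError check are illustrative choices that the problem statement does not specify. It also assumes the usual NT-Xent denominator, which sums over the positive and the 2N - 2 negatives and leaves out only the anchor itself.

```python
import numpy as np


def nt_xent_loss(embeddings, temperature):
    """Mean NT-Xent loss over 2N anchors; rows (2i, 2i+1) are positive pairs."""
    z = np.asarray(embeddings, dtype=np.float64)
    n = z.shape[0]  # n == 2N
    if z.ndim != 2 or n < 2 or n % 2 != 0:
        raise ValueError("embeddings must have shape (2N, d) with N >= 1")

    # Step 1: L2-normalize rows so that dot products are cosine similarities.
    # The small floor (an assumption) avoids division by zero for zero vectors.
    norms = np.linalg.norm(z, axis=1, keepdims=True)
    z = z / np.maximum(norms, 1e-12)
    sim = (z @ z.T) / temperature

    # Step 2: exclude each anchor's similarity with itself from the denominator.
    np.fill_diagonal(sim, -np.inf)

    # The partner of index 2i is 2i+1 and vice versa, i.e. index XOR 1.
    idx = np.arange(n)
    pos = sim[idx, idx ^ 1]

    # Step 3: log-sum-exp with the row maximum subtracted for stability.
    row_max = sim.max(axis=1, keepdims=True)
    lse = row_max[:, 0] + np.log(np.exp(sim - row_max).sum(axis=1))

    # Per-anchor loss is -log(exp(pos) / sum(exp(all others))) = lse - pos.
    return float(np.mean(lse - pos))
```

As a sanity check on this sketch: when all 2N embeddings are identical, every non-self similarity is equal, so each anchor's loss is log(2N - 1). With a single pair (N = 1) the denominator contains only the positive, so the loss is 0.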