Sampling Informative Positive Pairs in Contrastive Learning

Abstract

Contrastive learning is a paradigm for learning representation functions that recover useful similarity structure in a dataset based on samples of positive (similar) and negative (dissimilar) instances. The quality of the learned representations depends crucially on the degree to which the strategies for sampling positive and negative instances reflect useful structure in the data. Typically, positive instances are sampled by randomly perturbing an anchor point using some form of data augmentation. However, not all randomly sampled positive instances are equally effective. In this paper, we analyze strategies for sampling more effective positive instances. We consider a setting where class structure in the observed data derives from analogous structure in an unobserved latent space. We propose active sampling approaches for positive instances and investigate their role in effectively learning representation functions that recover the class structure in the underlying latent space.
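
To make the baseline setup concrete, the following is a minimal sketch of random positive sampling and a contrastive loss for a single anchor. It is an illustration under stated assumptions, not the method proposed in the paper: the Gaussian perturbation stands in for an unspecified data augmentation, and the cosine similarity, the InfoNCE-style loss, and all function and parameter names (`sample_positive`, `info_nce`, `noise_scale`, `temperature`) are hypothetical choices made for this example.

```python
import math
import random


def sample_positive(anchor, noise_scale=0.1, rng=random):
    """Draw a positive by adding Gaussian noise to the anchor.

    Assumption: Gaussian noise is a stand-in for a generic data augmentation.
    """
    return [x + rng.gauss(0.0, noise_scale) for x in anchor]


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(anchor, positive, negatives, temperature=0.5):
    """InfoNCE-style loss for one anchor: negative log-softmax of the positive.

    Assumption: the inputs are already representations; no encoder is modeled.
    """
    logits = [cosine(anchor, positive) / temperature]
    logits += [cosine(anchor, n) / temperature for n in negatives]
    # Log-sum-exp with the maximum subtracted for numerical stability.
    m = max(logits)
    log_denominator = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_denominator - logits[0]
```

The loss is small when the positive lies close to the anchor and the negatives lie far from it. In this sketch every perturbation is drawn the same way regardless of how much it helps training, which is the uniform random sampling that the abstract contrasts with more informative positive sampling.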

Publication
Sampling Theory and Applications