IEEE Access (Jan 2023)
SuperConText: Supervised Contrastive Learning Framework for Textual Representations
Abstract
Over the last decade, deep neural networks (DNNs) have been shown to outperform conventional machine learning models in supervised learning tasks. Most of these models are optimized by minimizing the well-known cross-entropy objective function, which has a number of drawbacks, including poor margins and instability. Taking inspiration from recent self-supervised contrastive representation learning approaches, we introduce the Supervised Contrastive learning framework for Textual representations (SuperConText) to address these issues. We pretrain a neural network by minimizing a novel fully supervised contrastive loss. The goal is to increase both the inter-class separability and the intra-class compactness of the embeddings in the latent space. Examples belonging to the same class are regarded as positive pairs, while examples belonging to different classes are considered negatives. Further, we propose a simple yet effective method for selecting hard negatives during the training phase. In an extensive series of experiments, we study the impact of a number of parameters (e.g., the batch size) on the quality of the learned representations. Simulation results show that the proposed solution outperforms several competing approaches on various large-scale text classification benchmarks without requiring specialized architectures, data augmentations, memory banks, or additional unsupervised data. For instance, we achieve a top-1 accuracy of 61.94% on the Amazon-F dataset, which is 3.54% above the best result obtained when using the cross-entropy loss with the same model architecture.
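The positive/negative pairing described in the abstract can be illustrated with a generic supervised contrastive loss computed over one batch. The abstract does not state the formula of the proposed loss or the hard-negative selection rule, so the sketch below is an assumption for illustration only: the function name `supervised_contrastive_loss`, the `temperature` parameter, and the cosine-similarity softmax form are hypothetical choices, not the SuperConText objective itself, and hard-negative selection is not modeled.

```python
import math


def supervised_contrastive_loss(embeddings, labels, temperature=0.1):
    """Generic supervised contrastive loss over one batch (illustrative only).

    embeddings: list of equal-length float vectors.
    labels: list of class ids, one per embedding.
    temperature: assumed scaling factor; not taken from the paper.
    """
    # L2-normalize so that dot products are cosine similarities.
    normed = []
    for vec in embeddings:
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        normed.append([x / norm for x in vec])

    n = len(normed)
    total, anchors_used = 0.0, 0
    for i in range(n):
        # Positives: other examples in the batch that share the anchor's class.
        positives = [j for j in range(n) if j != i and labels[j] == labels[i]]
        if not positives:
            continue  # an anchor with no same-class partner contributes nothing

        # Temperature-scaled similarity to every other example in the batch.
        sims = {
            j: sum(a * b for a, b in zip(normed[i], normed[j])) / temperature
            for j in range(n)
            if j != i
        }

        # Log of the softmax denominator, computed in a numerically stable way.
        max_sim = max(sims.values())
        log_denom = max_sim + math.log(
            sum(math.exp(s - max_sim) for s in sims.values())
        )

        # Average negative log-probability assigned to the positives.
        total += -sum(sims[j] - log_denom for j in positives) / len(positives)
        anchors_used += 1

    return total / anchors_used if anchors_used else 0.0


# Toy batch: two tight clusters in a 2-D latent space.
points = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
aligned = supervised_contrastive_loss(points, [0, 0, 1, 1])
misaligned = supervised_contrastive_loss(points, [0, 1, 0, 1])
```

In this toy batch, the loss is small when the class labels agree with the two clusters and large when they cut across them, which is the behavior that rewards intra-class compactness and inter-class separability.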
Keywords