SuperConText: Supervised Contrastive Learning Framework for Textual Representations

Youness Moukafih; Nada Sbihi; Mounir Ghogho; Kamel Smaili

doi:10.1109/ACCESS.2023.3241490

IEEE Access (Jan 2023)

Affiliations

Youness Moukafih: TIC Lab, College of Engineering and Architecture, Université Internationale de Rabat, Salé, Morocco
Nada Sbihi: TIC Lab, College of Engineering and Architecture, Université Internationale de Rabat, Salé, Morocco
Mounir Ghogho: TIC Lab, College of Engineering and Architecture, Université Internationale de Rabat, Salé, Morocco
Kamel Smaili: Loria, Campus Scientifique, Vandœuvre-lès-Nancy, France

DOI: https://doi.org/10.1109/ACCESS.2023.3241490
Journal volume & issue: Vol. 11
pp. 16820 – 16830

Abstract

In the last decade, deep neural networks (DNNs) have been shown to outperform conventional machine learning models in supervised learning tasks. Most of these models are optimized by minimizing the well-known cross-entropy objective function. The latter, however, has a number of drawbacks, including poor margins and instability. Taking inspiration from recent self-supervised contrastive representation learning approaches, we introduce a Supervised Contrastive learning framework for Textual representations (SuperConText) to address those issues. We pretrain a neural network by minimizing a novel fully supervised contrastive loss. The goal is to increase both the inter-class separability and the intra-class compactness of the embeddings in the latent space. Examples belonging to the same class are regarded as positive pairs, while examples belonging to different classes are considered negatives. Further, we propose a simple yet effective method for selecting hard negatives during the training phase. In an extensive series of experiments, we study the impact of a number of parameters (e.g., the batch size) on the quality of the learned representations. Simulation results show that the proposed solution outperforms several competing approaches on various large-scale text classification benchmarks without requiring specialized architectures, data augmentations, memory banks, or additional unsupervised data. For instance, we achieve a top-1 accuracy of 61.94% on the Amazon-F dataset, which is 3.54% above the best result obtained when using the cross-entropy loss with the same model architecture.
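
To make the positive/negative pairing described in the abstract concrete, here is a minimal NumPy sketch of a generic supervised contrastive loss over one batch. It is not the paper's loss: the abstract does not give the exact formulation, so the function name `supervised_contrastive_loss`, the `temperature` value, and the optional `top_k_negatives` argument (one simple way to restrict each anchor to its most similar, i.e. hardest, negatives) are all assumptions made for illustration.

```python
import numpy as np


def supervised_contrastive_loss(embeddings, labels, temperature=0.1,
                                top_k_negatives=None):
    """Generic supervised contrastive loss for one batch (illustrative only).

    Same-label examples are positives, different-label examples are negatives.
    `top_k_negatives` is a hypothetical hard-negative filter, not the paper's.
    Assumes every embedding has a non-zero norm.
    """
    z = np.asarray(embeddings, dtype=np.float64)
    y = np.asarray(labels)
    n = z.shape[0]
    if n < 2:
        raise ValueError("need at least two examples in the batch")
    if top_k_negatives is not None and top_k_negatives < 1:
        raise ValueError("top_k_negatives must be >= 1")

    # Cosine similarity scaled by the temperature.
    z = z / np.linalg.norm(z, axis=1, keepdims=True)
    sim = (z @ z.T) / temperature

    not_self = ~np.eye(n, dtype=bool)
    positives = (y[:, None] == y[None, :]) & not_self
    negatives = y[:, None] != y[None, :]

    if top_k_negatives is not None:
        # Keep only the k most similar negatives per anchor (ties are kept).
        neg_sim = np.where(negatives, sim, -np.inf)
        k = min(top_k_negatives, n)
        kth_largest = np.sort(neg_sim, axis=1)[:, -k][:, None]
        negatives = negatives & (neg_sim >= kth_largest)

    # Each anchor is contrasted against its positives and retained negatives.
    contrast = positives | negatives

    # Shift by the largest non-self similarity for numerical stability.
    logits = sim - np.where(not_self, sim, -np.inf).max(axis=1, keepdims=True)
    exp_logits = np.exp(logits) * contrast
    log_prob = logits - np.log(exp_logits.sum(axis=1, keepdims=True))

    # Average log-probability of the positives, for anchors that have any.
    pos_count = positives.sum(axis=1)
    has_pos = pos_count > 0
    if not has_pos.any():
        return 0.0
    mean_log_prob_pos = (log_prob * positives).sum(axis=1)[has_pos] / pos_count[has_pos]
    return float(-mean_log_prob_pos.mean())
```

In this sketch the loss is small when same-class embeddings are close and different-class embeddings are far apart, which matches the stated goal of intra-class compactness and inter-class separability. Anchors with no same-class partner in the batch contribute nothing, which is one reason the batch size matters for this family of losses.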

Published in IEEE Access

ISSN: 2169-3536 (Online)
Publisher: IEEE
Country of publisher: United States
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering
Website: https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=6287639
