SupMPN: Supervised Multiple Positives and Negatives Contrastive Learning Model for Semantic Textual Similarity

Somaiyeh Dehghan; Mehmet Fatih Amasyali

doi:10.3390/app12199659

Applied Sciences (Sep 2022)

Affiliations

Somaiyeh Dehghan: Department of Computer Engineering, Yildiz Technical University, Istanbul 34220, Turkey
Mehmet Fatih Amasyali: Department of Computer Engineering, Yildiz Technical University, Istanbul 34220, Turkey

Journal volume & issue: Vol. 12, no. 19, p. 9659

Abstract

Semantic Textual Similarity (STS) is an important task in Natural Language Processing (NLP) that measures the similarity of the underlying semantics of two texts. Although pre-trained contextual embedding models such as Bidirectional Encoder Representations from Transformers (BERT) have achieved state-of-the-art performance on several NLP tasks, BERT-derived sentence embeddings have been shown to collapse to some degree, i.e., the sentence embeddings generated by BERT depend on the frequency of words. As a result, almost all BERT-derived sentence embeddings are mapped into a small region of the representation space and have high cosine similarity with one another. Sentence embeddings generated by BERT are therefore not robust on the STS task, as they cannot capture the full semantic meaning of sentences. In this paper, we propose SupMPN, a Supervised Multiple Positives and Negatives Contrastive Learning Model, which accepts multiple hard-positive sentences and multiple hard-negative sentences simultaneously and tries to bring the hard-positive sentences closer together while pushing the hard-negative sentences away from them. In other words, SupMPN brings similar sentences closer together in the representation space by discriminating among multiple similar and dissimilar sentences. In this way, SupMPN learns the semantic meaning of sentences and generates sentence embeddings based on semantic meaning rather than word frequency. We evaluate our model on standard STS and transfer-learning tasks. The results show that SupMPN outperforms the state-of-the-art SimCSE and all other previous supervised and unsupervised models.
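
The idea of contrasting one sentence against several positives and several negatives at once can be illustrated with a small sketch. The abstract does not give the SupMPN loss, so the code below is not the authors' objective. It is a generic InfoNCE-style loss averaged over the positives, written in NumPy. The function names (`cosine_sim`, `multi_pos_neg_loss`), the toy 2-D vectors, and the temperature value of 0.05 are all assumptions chosen for illustration.

```python
import numpy as np

def cosine_sim(a, b):
    """Cosine similarity between vector a (d,) and each row of b (n, d)."""
    a = a / np.linalg.norm(a)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return b @ a

def multi_pos_neg_loss(anchor, positives, negatives, temperature=0.05):
    """Illustrative (hypothetical) contrastive loss with several positives
    and several negatives. For each positive p it computes
        -log( exp(s(a,p)/t) / (exp(s(a,p)/t) + sum_n exp(s(a,n)/t)) )
    and returns the mean over all positives."""
    pos = cosine_sim(anchor, positives) / temperature  # shape (P,)
    neg = cosine_sim(anchor, negatives) / temperature  # shape (N,)
    losses = []
    for p in pos:
        logits = np.concatenate(([p], neg))
        m = logits.max()                                # stabilise log-sum-exp
        lse = m + np.log(np.exp(logits - m).sum())
        losses.append(lse - p)
    return float(np.mean(losses))

# Toy embeddings: "near" vectors point roughly along the anchor, "far" ones do not.
anchor = np.array([1.0, 0.0])
near = np.array([[0.9, 0.1], [0.8, 0.2]])
far = np.array([[-1.0, 0.1], [0.0, 1.0]])

good = multi_pos_neg_loss(anchor, positives=near, negatives=far)
bad = multi_pos_neg_loss(anchor, positives=far, negatives=near)
print(good < bad)  # the loss is lower when positives are close and negatives are far
```

Minimising a loss of this shape pulls the anchor toward every positive and away from every negative in the same step, which matches the behaviour the abstract describes in words. How SupMPN weights or combines the terms may differ.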

Published in Applied Sciences

ISSN: 2076-3417 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Technology: Engineering (General). Civil engineering (General); Science: Biology (General); Science: Physics; Science: Chemistry
Website: http://www.mdpi.com/journal/applsci
