Journal of King Saud University: Computer and Information Sciences (Nov 2022)

Towards solving NLP tasks with optimal transport loss

  • Rishabh Bhardwaj
  • Tushar Vaidya
  • Soujanya Poria

Journal volume & issue
Vol. 34, no. 10
pp. 10434–10443

Abstract


Loss functions quantify the divergence of a model's predicted distribution from the ground truth, and they play a vital role in machine learning algorithms by steering the learning process. The most common loss functions in natural language processing (NLP), such as the Kullback–Leibler (KL) and Jensen–Shannon (JS) divergences, do not base their computations on the properties of label coordinates. Label coordinates can encode inter-label relationships: in sentiment classification, for instance, strongly positive sentiment is closer to positive sentiment than to strongly negative sentiment. Incorporating such information into the computation of the probability divergence can improve the model's learning dynamics. In this work, we study an under-explored loss function in NLP, Wasserstein Optimal Transport (OT), which takes label coordinates into account and thus allows the learning algorithm to incorporate inter-label relations. The limited adoption of OT-based losses, however, stems from the difficulty of defining quality label coordinates. We examine the current limitations of learning with OT and provide an algorithm that jointly learns the label coordinates with the model parameters. We show the efficacy of OT on several text classification tasks, such as sentiment analysis and emotion recognition in conversation, and we discuss the limitations of the approach. The source code for this work is publicly available at: https://github.com/declare-lab/NLP-OT.
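To make the core mechanism concrete, below is a minimal PyTorch sketch of an entropy-regularized OT loss (computed with Sinkhorn iterations) whose ground cost is derived from label coordinates trained jointly with the model. This is an illustration of the general idea under stated assumptions, not the authors' implementation; see the linked repository for the released code. The class name SinkhornOTLoss and the hyperparameters coord_dim, eps, and n_iters are hypothetical.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class SinkhornOTLoss(nn.Module):
        """Entropy-regularized OT loss with learnable label coordinates.

        The ground cost between classes is the Euclidean distance between
        their coordinate vectors, so inter-label relations enter the loss.
        The coordinates are nn.Parameters, updated by the same optimizer
        as the model. Hypothetical sketch, not the paper's released code.
        """

        def __init__(self, num_classes, coord_dim=8, eps=1.0, n_iters=50):
            super().__init__()
            self.label_coords = nn.Parameter(torch.randn(num_classes, coord_dim))
            self.eps = eps          # entropic regularization strength
            self.n_iters = n_iters  # number of Sinkhorn scaling updates

        def forward(self, logits, targets):
            p = F.softmax(logits, dim=-1)                    # predicted distribution, (B, K)
            q = F.one_hot(targets, p.size(-1)).float()       # ground-truth distribution, (B, K)
            C = torch.cdist(self.label_coords, self.label_coords)  # ground cost, (K, K)
            G = torch.exp(-C / self.eps)                     # Gibbs kernel
            u = torch.ones_like(p)
            for _ in range(self.n_iters):                    # Sinkhorn fixed-point updates
                v = q / (u @ G + 1e-9)                       # match column marginal q
                u = p / (v @ G.T + 1e-9)                     # match row marginal p
            # Transport plan pi = diag(u) G diag(v); loss is <pi, C> per sample.
            pi = u.unsqueeze(2) * G.unsqueeze(0) * v.unsqueeze(1)
            return (pi * C.unsqueeze(0)).sum(dim=(1, 2)).mean()

    # Usage: gradients flow to both the classifier logits and the label coordinates.
    loss_fn = SinkhornOTLoss(num_classes=5)
    logits = torch.randn(4, 5, requires_grad=True)
    targets = torch.tensor([0, 2, 4, 1])
    loss = loss_fn(logits, targets)
    loss.backward()

Because the cost matrix C is a function of the learnable coordinates, mispredicting a nearby class (e.g. "positive" instead of "strongly positive") incurs a smaller penalty than mispredicting a distant one, which is the inter-label behavior the abstract describes.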
