Lexicon‐based fine‐tuning of multilingual language models for low‐resource language sentiment analysis

Vinura Dhananjaya; Surangika Ranathunga; Sanath Jayasena

doi:10.1049/cit2.12333

CAAI Transactions on Intelligence Technology (Oct 2024)

Affiliations

Vinura Dhananjaya: Department of Computer Science and Engineering, University of Moratuwa, Moratuwa, Sri Lanka
Surangika Ranathunga: Department of Computer Science and Engineering, University of Moratuwa, Moratuwa, Sri Lanka
Sanath Jayasena: Department of Computer Science and Engineering, University of Moratuwa, Moratuwa, Sri Lanka

DOI: https://doi.org/10.1049/cit2.12333
Journal volume & issue: Vol. 9, no. 5, pp. 1116–1125

Abstract

Pre‐trained multilingual language models (PMLMs) such as mBERT and XLM‐R have shown good cross‐lingual transferability. However, they are not specifically trained to capture cross‐lingual signals concerning sentiment words. This is a disadvantage for low‐resource languages (LRLs), which are under‐represented in these models. To better fine‐tune these models for sentiment classification in LRLs, a novel intermediate task fine‐tuning (ITFT) technique based on a sentiment lexicon of a high‐resource language (HRL) is introduced. The authors experiment with the LRLs Sinhala, Tamil and Bengali on a 3‐class sentiment classification task and show that this method outperforms vanilla fine‐tuning of the PMLM. It also outperforms, or is on par with, basic ITFT that relies on an HRL sentiment classification dataset.
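The two-stage idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not specify how lexicon entries become training examples, so the word-to-label mapping, the label ids, and the names `lexicon_to_examples`, `run_itft` and `fine_tune` are all hypothetical. The `fine_tune` argument stands in for whatever training routine is used for a PMLM such as XLM‐R.

```python
from typing import Callable, Dict, List, Tuple

Example = Tuple[str, int]  # (text, label id)

# Assumed 3-class label scheme; the abstract only says the task is 3-class.
LABELS = {"negative": 0, "neutral": 1, "positive": 2}


def lexicon_to_examples(lexicon: Dict[str, str]) -> List[Example]:
    """Turn HRL lexicon entries (word -> polarity) into (text, label) pairs.

    Assumption: each lexicon word is used as a one-word training text.
    """
    return [(word, LABELS[polarity]) for word, polarity in sorted(lexicon.items())]


def run_itft(model, lexicon: Dict[str, str], target_train: List[Example],
             fine_tune: Callable):
    """Apply two fine-tuning stages in order and return the resulting model."""
    # Stage 1: intermediate task built from the HRL sentiment lexicon.
    model = fine_tune(model, lexicon_to_examples(lexicon))
    # Stage 2: target task, i.e. the LRL sentiment classification dataset.
    model = fine_tune(model, target_train)
    return model
```

Vanilla fine-tuning corresponds to running only the second stage. The basic ITFT baseline mentioned in the abstract would replace the lexicon-derived examples in the first stage with an HRL sentiment classification dataset.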

Published in CAAI Transactions on Intelligence Technology

ISSN: 2468-2322 (Online)
Publisher: Wiley
Country of publisher: United Kingdom
LCC subjects: Language and Literature: Philology. Linguistics: Computational linguistics. Natural language processing; Science: Mathematics: Instruments and machines: Electronic computers. Computer science: Computer software
Website: https://ietresearch.onlinelibrary.wiley.com/journal/24682322
