Measurement: Sensors (Dec 2022)

An ensemble model for idioms and literal text classification using knowledge-enabled BERT in deep learning

  • S. Abarna,
  • J.I. Sheeba,
  • S. Pradeep Devaneyan

Journal volume & issue
Vol. 24
p. 100434

Abstract


Language, as a system of communication, conveys both literal and figurative meanings. The literal sense is straightforward, whereas the figurative sense employs devices such as metaphors, similes, proverbs, and idioms to create a distinctive effect or imaginative description. Idioms are phrases whose meaning differs from that of the individual words that compose them. Because of their non-compositional character, idiom detection is a significant difficulty in NLP tasks such as text categorisation. Inaccurate idiom recognition reduces model performance in crucial text categorisation tasks such as cyberbullying detection and sentiment analysis. Existing systems categorise phrases as literal or idiomatic using pre-trained language representation models such as BERT (Bidirectional Encoder Representations from Transformers) and RoBERTa (Robustly Optimised BERT Pretraining Approach), and they perform more accurately than the baseline models. We propose a method for categorising idioms and literals using K-BERT (Knowledge-enabled BERT), a deep learning model that injects knowledge graphs (KGs) into sentences as domain knowledge. This model is additionally combined with the baseline models BERT and RoBERTa through a stacking ensemble. The TroFi Metaphor dataset was used to train the model, while a new in-house dataset was used for testing.
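The stacking ensemble mentioned in the abstract can be illustrated with a minimal sketch. The key idea is that each fine-tuned base model (K-BERT, BERT, RoBERTa) emits a probability that a sentence is idiomatic, and those probabilities become input features for a meta-learner that makes the final idiom/literal decision. The probability values and the logistic-regression meta-learner below are illustrative assumptions, not the paper's exact configuration; in the real pipeline the features would come from the transformers' softmax outputs.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical P(idiom) scores that three fine-tuned base models
# (K-BERT, BERT, RoBERTa) might assign to 8 training sentences.
base_probs = np.array([
    [0.10, 0.20, 0.15],   # literal sentences: all models score low
    [0.20, 0.10, 0.20],
    [0.30, 0.25, 0.10],
    [0.15, 0.30, 0.20],
    [0.80, 0.90, 0.85],   # idiomatic sentences: all models score high
    [0.90, 0.80, 0.90],
    [0.70, 0.85, 0.80],
    [0.85, 0.70, 0.90],
])
labels = np.array([0, 0, 0, 0, 1, 1, 1, 1])  # 1 = idiom, 0 = literal

# Stacking: the base models' probabilities are the meta-learner's
# features; it learns how to weigh each model's vote.
meta = LogisticRegression().fit(base_probs, labels)

# A new sentence that all three base models consider idiomatic.
prediction = meta.predict([[0.90, 0.80, 0.85]])[0]  # -> 1 (idiom)
```

In practice the base-model probabilities for the training sentences would be produced with cross-validation (out-of-fold predictions), so the meta-learner does not overfit to the base models' training-set confidence.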

Keywords