Measurement: Sensors (Dec 2022)

An ensemble model for idioms and literal text classification using knowledge-enabled BERT in deep learning

  • S. Abarna,
  • J.I. Sheeba,
  • S. Pradeep Devaneyan

Journal volume & issue
Vol. 24
p. 100434

Abstract


Language, as a system of communication, conveys both literal and figurative meanings. The literal sense is straightforward, whereas the figurative sense employs devices such as metaphors, similes, proverbs, and idioms to create a distinctive effect or imaginative description. Idioms are phrases whose meaning differs from that of the individual words that compose them. Because of their non-compositional character, idiom detection is a significant difficulty in NLP tasks such as text categorisation. Inaccurate idiom recognition reduces model performance in crucial text categorisation tasks such as cyberbullying detection and sentiment analysis. Existing systems categorise phrases as literal or idiomatic using pre-trained language representation models such as BERT (Bidirectional Encoder Representations from Transformers) and RoBERTa (Robustly Optimised BERT Pretraining Approach), and they perform more accurately than the baseline models. We propose a method for categorising idioms and literals using K-BERT (Knowledge-enabled BERT), a deep learning model that injects knowledge graphs (KGs) into sentences as domain knowledge. This model is additionally combined with the baseline models BERT and RoBERTa through a stacking ensemble. The TroFi Metaphor dataset was used to train the model, while a new in-house dataset was used for testing.
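The stacking ensemble mentioned in the abstract can be illustrated with a minimal sketch. The key idea is that each fine-tuned base model (K-BERT, BERT, RoBERTa) emits a probability that a sentence is idiomatic, and those probabilities become input features for a meta-learner that makes the final idiom/literal decision. The probability values and the logistic-regression meta-learner below are illustrative assumptions, not the paper's exact configuration; in the real pipeline the features would come from the transformers' softmax outputs.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical P(idiom) scores that three fine-tuned base models
# (K-BERT, BERT, RoBERTa) might assign to 8 training sentences.
base_probs = np.array([
    [0.10, 0.20, 0.15],   # literal sentences: all models score low
    [0.20, 0.10, 0.20],
    [0.30, 0.25, 0.10],
    [0.15, 0.30, 0.20],
    [0.80, 0.90, 0.85],   # idiomatic sentences: all models score high
    [0.90, 0.80, 0.90],
    [0.70, 0.85, 0.80],
    [0.85, 0.70, 0.90],
])
labels = np.array([0, 0, 0, 0, 1, 1, 1, 1])  # 1 = idiom, 0 = literal

# Stacking: the base models' probabilities are the meta-learner's
# features; it learns how to weigh each model's vote.
meta = LogisticRegression().fit(base_probs, labels)

# A new sentence that all three base models consider idiomatic.
prediction = meta.predict([[0.90, 0.80, 0.85]])[0]  # -> 1 (idiom)
```

In practice the base-model probabilities for the training sentences would be produced with cross-validation (out-of-fold predictions), so the meta-learner does not overfit to the base models' training-set confidence.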

Keywords