Transactions of the Association for Computational Linguistics (Jan 2021)

Idiomatic Expression Identification using Semantic Compatibility

  • Ziheng Zeng,
  • Suma Bhat

DOI
https://doi.org/10.1162/tacl_a_00442
Journal volume & issue
Vol. 9
pp. 1546 – 1562

Abstract

Read online

AbstractIdiomatic expressions are an integral part of natural language and constantly being added to a language. Owing to their non-compositionality and their ability to take on a figurative or literal meaning depending on the sentential context, they have been a classical challenge for NLP systems. To address this challenge, we study the task of detecting whether a sentence has an idiomatic expression and localizing it when it occurs in a figurative sense. Prior research for this task has studied specific classes of idiomatic expressions offering limited views of their generalizability to new idioms. We propose a multi-stage neural architecture with attention flow as a solution. The network effectively fuses contextual and lexical information at different levels using word and sub-word representations. Empirical evaluations on three of the largest benchmark datasets with idiomatic expressions of varied syntactic patterns and degrees of non-compositionality show that our proposed model achieves new state-of-the-art results. A salient feature of the model is its ability to identify idioms unseen during training with gains from 1.4% to 30.8% over competitive baselines on the largest dataset.