Information (Jan 2022)

Learning the Morphological and Syntactic Grammars for Named Entity Recognition

  • Mengtao Sun,
  • Qiang Yang,
  • Hao Wang,
  • Mark Pasquine,
  • Ibrahim A. Hameed

DOI
https://doi.org/10.3390/info13020049
Journal volume & issue
Vol. 13, no. 2
p. 49

Abstract

In some languages, Named Entity Recognition (NER) is severely hindered by complex linguistic structures, such as inflection, which can confuse data-driven models about a word's actual meaning. This work tries to alleviate these problems by introducing a novel neural network based on morphological and syntactic grammars. The experiments were performed on four Nordic languages, which have many grammar rules. The model is named the NorG network (Nor: Nordic Languages, G: Grammar). In addition to learning from the text content, the NorG network also learns from the word writing form, the POS tag, and the dependency relations. The proposed neural network consists of a bidirectional Long Short-Term Memory (Bi-LSTM) layer to capture word-level grammars and a bidirectional Graph Attention (Bi-GAT) layer to capture sentence-level grammars. Experimental results on the four languages show that the grammar-assisted network significantly improves on the baselines. We also investigate, through exploratory experiments, how the NorG network handles each grammar component.
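To make the grammar inputs named in the abstract more concrete, the following is a minimal Python sketch of how the word writing form, the POS tag, and the dependency structure of one sentence might be prepared before being passed to a network. It is an illustration under stated assumptions, not the authors' implementation: the `Token` class, the `word_shape` and `dependency_edges` helpers, the example sentence with its tags and heads, and the reading of "bidirectional" as one edge list per direction are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Token:
    text: str
    pos: str   # POS tag (Universal Dependencies-style tags, chosen for illustration)
    head: int  # 1-based index of the syntactic head, 0 for the root


def word_shape(word: str) -> str:
    """Map a word to a coarse writing-form pattern, e.g. 'Oslo' -> 'Xxxx'."""
    shape = []
    for ch in word:
        if ch.isupper():
            shape.append("X")
        elif ch.islower():
            shape.append("x")
        elif ch.isdigit():
            shape.append("d")
        else:
            shape.append(ch)  # keep punctuation and other symbols as-is
    return "".join(shape)


def dependency_edges(tokens):
    """Return two 0-based edge lists: head -> dependent and dependent -> head.

    Assumption: a bidirectional graph layer consumes one edge list per
    direction. The root token has no incoming head edge and is skipped.
    """
    forward = [(t.head - 1, i) for i, t in enumerate(tokens) if t.head > 0]
    backward = [(dst, src) for src, dst in forward]
    return forward, backward


# Hypothetical Norwegian example: "Kari bor i Oslo" ("Kari lives in Oslo").
sentence = [
    Token("Kari", "PROPN", 2),
    Token("bor", "VERB", 0),
    Token("i", "ADP", 4),
    Token("Oslo", "PROPN", 2),
]

shapes = [word_shape(t.text) for t in sentence]
forward, backward = dependency_edges(sentence)

print(shapes)    # writing-form features, one per token
print(forward)   # head -> dependent edges
print(backward)  # dependent -> head edges
```

In this sketch the capitalized shape `Xxxx` marks both proper nouns, which is the kind of word-level cue a writing-form feature can expose, while the two edge lists give a graph layer access to the sentence-level structure in both directions.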

Keywords