International Journal of Crowd Science (Jun 2022)

Leveraging Integrated Learning for Open-Domain Chinese Named Entity Recognition

  • Jin Diao,
  • Zhangbing Zhou,
  • Guangli Shi

DOI
https://doi.org/10.26599/IJCS.2022.9100015
Journal volume & issue
Vol. 6, no. 2
pp. 74 – 79

Abstract

Named entity recognition (NER) is a fundamental technique in natural language processing that underpins downstream tasks such as natural language question reasoning, text matching, and semantic text similarity. Compared with English NER, Chinese NER is complicated by noise arising from the complex meanings, diverse structures, and ambiguous semantic boundaries of the Chinese language itself. Moreover, compared with specific domains, the open domain has more complex and changeable entity types and a considerably larger number of entities, which makes open-domain Chinese NER more difficult still. Existing open-domain NER methods also have low recognition rates. This paper therefore proposes a method based on the bidirectional long short-term memory conditional random field (BiLSTM-CRF) model, which leverages integrated learning to improve the efficiency of Chinese NER. Compared with single models, including CRF, BiLSTM-CRF, and gated recurrent unit-CRF (GRU-CRF), the proposed method can significantly improve the accuracy of open-domain Chinese NER.
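
The abstract does not say how the individual models are combined. If "integrated learning" is read as ensemble learning, one common scheme is per-character majority voting over the label sequences produced by several taggers. The following is a minimal sketch under that assumption, not the authors' method; the function name `majority_vote`, the BIO label set, and the three example prediction lists are hypothetical and stand in for the outputs of models such as CRF, BiLSTM-CRF, and GRU-CRF.

```python
from collections import Counter
from typing import List, Sequence


def majority_vote(predictions: Sequence[Sequence[str]]) -> List[str]:
    """Combine label sequences from several taggers by per-position voting.

    Hypothetical helper for illustration: each inner sequence holds one
    model's labels for the same sentence, one label per character.
    """
    if not predictions:
        raise ValueError("need at least one prediction sequence")
    length = len(predictions[0])
    if any(len(p) != length for p in predictions):
        raise ValueError("all prediction sequences must have the same length")

    combined = []
    for labels in zip(*predictions):
        counts = Counter(labels)
        top = max(counts.values())
        # Ties are broken in favour of the earliest model in the list.
        winner = next(label for label in labels if counts[label] == top)
        combined.append(winner)
    return combined


# Made-up predictions for the four characters of "北京大学".
crf_labels = ["B-ORG", "I-ORG", "I-ORG", "I-ORG"]
bilstm_crf_labels = ["B-LOC", "I-LOC", "B-ORG", "I-ORG"]
gru_crf_labels = ["B-ORG", "I-ORG", "I-ORG", "I-ORG"]

ensemble_labels = majority_vote([crf_labels, bilstm_crf_labels, gru_crf_labels])
print(ensemble_labels)
```

Voting position by position can produce label sequences that violate the BIO scheme (for example, an `I-ORG` that follows `O`), so a practical system would likely need a repair step or would vote over whole entity spans instead.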

Keywords