Scientific Reports (Mar 2024)

A Chinese named entity recognition model incorporating recurrent cell and information state recursion

  • Qingbin Han,
  • Jialin Ma

DOI
https://doi.org/10.1038/s41598-024-56166-3
Journal volume & issue
Vol. 14, no. 1
pp. 1 – 12

Abstract

Read online

Abstract Chinese is characterized by high syntactic complexity, chaotic annotation granularity, and slow convergence. Joint learning models can effectively improve the accuracy of Chinese Named Entity Recognition (NER), but they focus too much on local feature information and reduce the ability of long sequence feature extraction. To address the limitations of long sequence feature extraction ability, we propose a Chinese NER model called Incorporating Recurrent Cell and Information State Recursion (IRCSR-NER). The model integrates recurrent cells and information state recursion to improve the recognition ability of long entity boundaries. To solve the problem that Chinese and English have different focuses in syntactic analysis. We use the syntactic dependency approach to add lexical relationship information to sentences represented at the word level. The IRCSR-NER is applied to sequence feature extraction to improve the model efficiency and long-text feature extraction ability. The model captures contextual long-distance dependent information while focusing on local feature information. We evaluated our proposed model using four public datasets and compared it with other mainstream models. Experimental results demonstrate that our model outperforms traditional and mainstream models.