Named Entity Recognition Model Based on k-best Viterbi Decoupling Knowledge Distillation

ZHAO Honglei, TANG Huanling, ZHANG Yu, SUN Xueyuan, LU Mingyu

doi:10.3778/j.issn.1673-9418.2211052

Jisuanji kexue yu tansuo (Mar 2024)

Affiliations

ZHAO Honglei, TANG Huanling, ZHANG Yu, SUN Xueyuan, LU Mingyu: 1. School of Information and Electronic Engineering, Shandong Technology and Business University, Yantai, Shandong 264005, China 2. School of Computer Science and Technology, Shandong Technology and Business University, Yantai, Shandong 264005, China 3. Co-innovation Center of Shandong Colleges and Universities: Future Intelligent Computing, Yantai, Shandong 264005, China 4. Key Laboratory of Intelligent Information Processing in Universities of Shandong (Shandong Technology and Business University), Yantai, Shandong 264005, China 5. Information Science and Technology College, Dalian Maritime University, Dalian, Liaoning 116026, China

DOI: https://doi.org/10.3778/j.issn.1673-9418.2211052
Journal volume & issue: Vol. 18, no. 3
pp. 780 – 794

Abstract

Knowledge distillation is a general approach to improving the performance of named entity recognition (NER) models. However, classical knowledge distillation loss functions are coupled, which leads to poor logit distillation. To decouple the loss and improve logit distillation, this paper proposes k-best Viterbi decoupling knowledge distillation (kvDKD), which incorporates k-best Viterbi decoding to improve computational efficiency while effectively improving model performance. In addition, deep-learning-based NER is prone to introducing noise during data augmentation. A data augmentation method that combines data filtering with an entity rebalancing algorithm is therefore proposed; it aims to reduce the noise carried over from the original dataset and to mitigate the problem of mislabeled data, thereby improving data quality and reducing overfitting. Building on these methods, a novel NER model, NER-kvDKD (named entity recognition model based on k-best Viterbi decoupling knowledge distillation), is proposed. Comparative experiments on the MSRA, Resume, Weibo, CLUENER, and CoNLL-2003 datasets show that the proposed method improves the generalization ability of the model and effectively improves the performance of the student model.
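
For readers unfamiliar with k-best Viterbi decoding, the following is a minimal sketch of the generic algorithm over CRF-style scores. It is not taken from the paper: the abstract does not say how kvDKD consumes the k best paths, and the function name `k_best_viterbi` and the `emissions` and `transitions` score layouts are assumptions made for illustration only.

```python
import heapq


def k_best_viterbi(emissions, transitions, k):
    """Return up to k highest-scoring tag sequences as (score, path) pairs.

    Hypothetical score layout (an assumption, not the paper's notation):
      emissions[t][j]   -- score of tag j at position t
      transitions[i][j] -- score of moving from tag i to tag j
    """
    n_tags = len(emissions[0])
    # beams[j] holds up to k (score, path) candidates that end in tag j.
    beams = [[(emissions[0][j], (j,))] for j in range(n_tags)]
    for t in range(1, len(emissions)):
        new_beams = []
        for j in range(n_tags):
            # Extend every surviving candidate with tag j, then keep the k best.
            candidates = [
                (score + transitions[path[-1]][j] + emissions[t][j], path + (j,))
                for beam in beams
                for score, path in beam
            ]
            new_beams.append(heapq.nlargest(k, candidates))
        beams = new_beams
    # Merge the per-tag beams and keep the k best complete sequences.
    finals = [cand for beam in beams for cand in beam]
    return heapq.nlargest(k, finals)


# Toy usage: 3 positions, 2 tags, made-up scores.
toy_emissions = [[1.0, 0.0], [0.0, 2.0], [0.5, 0.0]]
toy_transitions = [[0.5, -1.0], [0.0, 0.2]]
for score, path in k_best_viterbi(toy_emissions, toy_transitions, k=3):
    print(path, round(score, 3))
```

Keeping only k candidates per tag at each position is what makes the search tractable: the work grows linearly with sequence length instead of enumerating every possible tag sequence.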

named entity recognition (NER); knowledge distillation; k-best Viterbi decoding; data augmentation

Published in Jisuanji kexue yu tansuo

ISSN: 1673-9418 (Print)
Publisher: Journal of Computer Engineering and Applications Beijing Co., Ltd., Science Press
Country of publisher: China
LCC subjects: Science: Mathematics: Instruments and machines: Electronic computers. Computer science
Website: http://fcst.ceaj.org
