IEEE Access (Jan 2020)

Improving Distantly-Supervised Named Entity Recognition for Traditional Chinese Medicine Text via a Novel Back-Labeling Approach

  • Dezheng Zhang,
  • Chao Xia,
  • Cong Xu,
  • Qi Jia,
  • Shibing Yang,
  • Xiong Luo,
  • Yonghong Xie

DOI
https://doi.org/10.1109/ACCESS.2020.3015056
Journal volume & issue
Vol. 8
pp. 145413 – 145421

Abstract

Recent advances in deep neural networks (DNNs) have made it possible to build reliable named entity recognition (NER) models without handcrafted features. However, these machine learning methods impose an obstacle of their own: they require a large amount of manually labeled data. Distant supervision can replace human annotation and so avoid this limitation, but a technical challenge remains: entities that are not included in the vocabulary are ignored and therefore receive erroneous labels, and this issue must be addressed to obtain an effective NER model. We therefore propose a novel back-labeling approach and integrate it into a tagging scheme, which we apply to the NER task in the traditional Chinese medicine (TCM) field. In addition, we discuss how to use distant supervision methods to achieve better NER performance. Our experiments verify that the scheme effectively improves entity recognition based on distant supervision.

Keywords