Applied Sciences (Jul 2022)

Chinese Named Entity Recognition of Geological News Based on BERT Model

  • Chao Huang,
  • Yuzhu Wang,
  • Yuqing Yu,
  • Yujia Hao,
  • Yuebin Liu,
  • Xiujian Zhao

DOI
https://doi.org/10.3390/app12157708
Journal volume & issue
Vol. 12, no. 15
p. 7708

Abstract

With the ongoing progress of geological survey work and the continuous accumulation of geological data, extracting accurate information from massive geological data has become increasingly difficult. To fully mine and utilize geological data, this study proposes a geological news named entity recognition (GNNER) method based on the bidirectional encoder representations from transformers (BERT) pre-trained language model. The method addresses two shortcomings of approaches built on traditional word vectors, namely their difficulty in representing contextual semantics and their limited extraction performance, and it can also support the construction of knowledge graphs for geological news. First, the method uses the BERT pre-trained model to embed the words in the geological news text, and the dynamically obtained word vectors serve as the model’s input. Second, the word vectors are passed to a bidirectional long short-term memory (BiLSTM) model for further training to obtain contextual features. Finally, entities of six types are extracted using conditional random field (CRF) sequence decoding. In experiments on a purpose-built Chinese geological news dataset, the model achieves an average F1 score of 0.839. The experimental results show that the model can effectively identify entities in geological news.
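The final decoding step can be illustrated with a minimal sketch of Viterbi decoding for a linear-chain CRF. This is not the authors' implementation: the tag set `TAGS`, the score tables `EMISSIONS` and `TRANSITIONS`, and the function name `viterbi_decode` are all hypothetical and chosen only for illustration. In a BERT-BiLSTM-CRF pipeline of the kind described above, the emission scores would typically come from the BiLSTM layer and the transition scores would be learned during training.

```python
from typing import List, Sequence


def viterbi_decode(
    emissions: Sequence[Sequence[float]],
    transitions: Sequence[Sequence[float]],
) -> List[int]:
    """Return the highest-scoring tag index sequence for a linear-chain CRF.

    emissions[t][j]   : score of tag j at position t (assumed to come from an encoder).
    transitions[i][j] : score of moving from tag i to tag j.
    """
    if not emissions:
        return []
    num_tags = len(emissions[0])
    # Best score of any path ending in each tag at the current position.
    score = list(emissions[0])
    backpointers = []
    for t in range(1, len(emissions)):
        new_score = []
        pointers = []
        for j in range(num_tags):
            # Pick the previous tag that maximises the path score into tag j.
            best_prev = max(range(num_tags), key=lambda i: score[i] + transitions[i][j])
            new_score.append(score[best_prev] + transitions[best_prev][j] + emissions[t][j])
            pointers.append(best_prev)
        score = new_score
        backpointers.append(pointers)
    # Trace the best path backwards from the best final tag.
    path = [max(range(num_tags), key=lambda j: score[j])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Hypothetical BIO tag set with a single generic entity type.
TAGS = ["O", "B-ENT", "I-ENT"]
FORBIDDEN = -1e4  # large negative score that effectively bans a transition

# Illustrative transition scores: "O" may not be followed directly by "I-ENT".
TRANSITIONS = [
    [0.0, 0.0, FORBIDDEN],  # from O
    [0.0, 0.0, 0.0],        # from B-ENT
    [0.0, 0.0, 0.0],        # from I-ENT
]

# Illustrative emission scores for a three-token sentence.
EMISSIONS = [
    [2.0, 0.5, 0.1],
    [0.2, 1.0, 1.5],
    [0.1, 0.2, 2.0],
]

decoded = [TAGS[i] for i in viterbi_decode(EMISSIONS, TRANSITIONS)]
print(decoded)
```

With these toy scores, picking the highest emission at each position independently would yield the invalid sequence O, I-ENT, I-ENT. Decoding over the whole sequence with transition scores avoids this, which is the usual motivation for placing a CRF layer on top of a BiLSTM.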

Keywords