ISPRS International Journal of Geo-Information (Sep 2023)

ChineseCTRE: A Model for Geographical Named Entity Recognition and Correction Based on Deep Neural Networks and the BERT Model

  • Wei Zhang,
  • Jingtao Meng,
  • Jianhua Wan,
  • Chengkun Zhang,
  • Jiajun Zhang,
  • Yuanyuan Wang,
  • Liuchang Xu,
  • Fei Li

DOI
https://doi.org/10.3390/ijgi12100394
Journal volume & issue
Vol. 12, no. 10
p. 394

Abstract

Social media is widely used to share real-time information and report accidents during natural disasters. As a result, identifying location names in social media text has become an increasingly important need. Named entity recognition (NER), a fundamental task in geospatial information applications, aims to extract location names from natural language text. Named entity correction (NEC), a complementary task to NER, plays a crucial role in ensuring that extracted location names are correct and in further improving the accuracy of NER. Although numerous methods have been adopted for NER, including text statistics-based and deep learning-based methods, research on NEC has been limited. To address this gap, we propose the CTRE model, a geospatial named entity recognition and correction model built on the BERT framework. Our approach enhances BERT by adding an incremental pre-training step to the pre-training phase, which improves the model’s recognition accuracy. We then follow the pre-training and fine-tuning paradigm of the BERT base model and extend the fine-tuning process, incorporating a neural network framework to construct a geospatial named entity recognition model and a geospatial named entity correction model, respectively. Incremental pre-training on augmented VGI (volunteered geographic information) data and social media data raises the model accuracy from 85% to 87%. The geospatial named entity recognition model reaches an F1 score of 0.9045, and the geospatial named entity correction model achieves a precision of 0.9765. The experimental results demonstrate the effectiveness of the proposed CTRE model and provide a reference for subsequent research on location names.
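For readers unfamiliar with the reported metrics, the following is a minimal sketch of how precision, recall, and F1 are commonly computed for NER at the entity level. It assumes exact-match scoring over (start, end, label) spans; the abstract does not state which matching scheme the authors used, and the function name `entity_prf1` and the toy spans are hypothetical.

```python
def entity_prf1(gold, pred):
    """Entity-level precision, recall, and F1 under exact span matching.

    gold, pred: iterables of (start, end, label) tuples.
    Assumption: an entity counts as correct only if its span and label
    both match exactly; this may differ from the paper's evaluation.
    """
    gold, pred = set(gold), set(pred)
    true_positives = len(gold & pred)
    precision = true_positives / len(pred) if pred else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy example (made-up spans, not data from the paper):
# 3 gold location entities, 4 predicted, 2 of them correct.
gold_spans = [(0, 2, "LOC"), (5, 8, "LOC"), (10, 12, "LOC")]
pred_spans = [(0, 2, "LOC"), (5, 8, "LOC"), (13, 15, "LOC"), (20, 22, "LOC")]
precision, recall, f1 = entity_prf1(gold_spans, pred_spans)
```

Here precision is 2/4, recall is 2/3, and F1 is their harmonic mean, 4/7. F1 rewards a model only when both precision and recall are high, which is why it is the usual headline figure for recognition tasks.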

Keywords