IEEE Access (Jan 2020)
BERT-Based Chinese Relation Extraction for Public Security
Abstract
The past few years have witnessed a number of public safety incidents around the world. With the advent of the big data era, effectively extracting public security information from the internet has become highly significant. Up to hundreds of terabytes of data are injected into the network every second, so processing them manually is impossible. Natural Language Processing (NLP) is dedicated to developing intelligent systems for effective text information mining. By analyzing text and quickly extracting the relationships between relevant entities, NLP can establish a knowledge graph (KG) of public security, which lays the foundation for safety case analysis, information monitoring, and activity tracking and locating. One widely used pre-training model for relation extraction is Word2Vec. However, Word2Vec is a single mapping: it produces one static representation for each word, regardless of the sentence in which the word appears. In contrast, the BERT model takes contextual information into account and generates richer, dynamic vector representations of words. Therefore, in this paper we propose a Bidirectional Encoder Representations from Transformers (BERT) based Chinese relation extraction algorithm for public security, which can effectively mine security information. The BERT model is pre-trained on two tasks, masked language modeling and next-sentence prediction; it is built on the Transformer encoder, and its main structure is a stack of Transformers. Extensive simulations are conducted to evaluate the proposed algorithm against several state-of-the-art schemes.
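To make the static-versus-contextual distinction concrete, the following minimal sketch (illustrative only, not the paper's implementation; it assumes the Hugging Face transformers package and the public bert-base-chinese checkpoint) compares the vectors BERT assigns to the same Chinese character in two different sentences. A Word2Vec-style static lookup table would return identical vectors for both occurrences.

import torch
from transformers import BertTokenizer, BertModel

tokenizer = BertTokenizer.from_pretrained("bert-base-chinese")
model = BertModel.from_pretrained("bert-base-chinese")
model.eval()

def token_vector(sentence: str, target: str) -> torch.Tensor:
    """Return the contextual vector of the first occurrence of `target`."""
    inputs = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state[0]  # (seq_len, 768)
    tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0].tolist())
    return hidden[tokens.index(target)]

# The character 行 means "bank" in 银行 but "walk" in 行走; BERT's output
# for it depends on the surrounding sentence, a static table's does not.
v_bank = token_vector("我去银行取钱。", "行")
v_walk = token_vector("他在街上行走。", "行")
cos = torch.nn.functional.cosine_similarity(v_bank, v_walk, dim=0)
print(f"cosine similarity across contexts: {cos.item():.3f}")  # well below 1.0

It is this context sensitivity that the proposed algorithm exploits when extracting relations between entities in public-security text.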
Keywords