Applied Sciences (Nov 2022)
A Novel Named Entity Recognition Algorithm for Hot Strip Rolling Based on BERT-Imseq2seq-CRF Model
Abstract
Named entity recognition is both the first step of text information extraction and a key step in constructing domain knowledge graphs. To address the large volume of text data, the complex process flow, and the urgent application needs of the hot strip rolling process, this paper proposes a novel named entity recognition algorithm based on the BERT-Imseq2seq-CRF model. First, the algorithm uses the pre-trained BERT language model to mine the dependencies in the domain text and obtain the corresponding representation vectors. Next, the representation vectors are fed to the encoder layer, and the encoder output vectors are passed to the decoder in addition to the semantic vector, which is all the original model considers. A Teacher-Forcing mechanism is integrated into the decoder layer to randomly correct the labeling results, which avoids error accumulation and safeguards sequence recognition performance. Finally, the validity of the labeling results is checked against conditional random field (CRF) constraints, which improves the overall labeling quality of the algorithm. Experimental results show that the model predicts entity labels for hot strip rolling text efficiently and accurately and outperforms the other models compared, reaching an F1-score of 91.47%. The model thus provides technical support for information extraction and domain knowledge graph construction in hot strip rolling.
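The decoder-side ideas in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a BIO tagging scheme, hypothetical label names such as `B-EQUIP`, and a stand-in `predict_step` function in place of a real decoder. The first function shows one common reading of random Teacher-Forcing: at each step the decoder's next input is the gold label with some probability, and otherwise its own prediction. The second shows the kind of label-sequence constraint a CRF layer is meant to enforce. A real CRF learns transition scores from data rather than applying a hard rule.

```python
import random


def decode_with_teacher_forcing(gold_labels, predict_step, teacher_forcing_ratio, rng):
    """Decode step by step, randomly feeding the gold label back in.

    predict_step is a hypothetical stand-in for the decoder: it maps the
    previous label to a predicted label for the current position.
    """
    prev = "<START>"
    outputs = []
    for gold in gold_labels:
        pred = predict_step(prev)
        outputs.append(pred)
        # With probability teacher_forcing_ratio the next input is the gold
        # label, so an earlier mistake is not carried into later steps.
        prev = gold if rng.random() < teacher_forcing_ratio else pred
    return outputs


def is_valid_bio(labels):
    """Return True if every I-X tag directly follows B-X or I-X (assumed BIO scheme)."""
    prev = "O"
    for label in labels:
        if label.startswith("I-"):
            entity_type = label[2:]
            if prev not in ("B-" + entity_type, "I-" + entity_type):
                return False
        prev = label
    return True


if __name__ == "__main__":
    rng = random.Random(0)
    gold = ["B-EQUIP", "I-EQUIP", "O"]  # hypothetical entity type
    echo = lambda prev: prev  # toy decoder that just repeats its input
    print(decode_with_teacher_forcing(gold, echo, 1.0, rng))
    print(is_valid_bio(gold), is_valid_bio(["O", "I-EQUIP"]))
```

With a ratio of 1.0 the toy decoder always receives the gold label, and with 0.0 it only ever sees its own output. Intermediate ratios mix the two.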
Keywords