Jisuanji kexue yu tansuo (May 2020)
Research on Named Entity Recognition Technology in Military Software Testing
Abstract
Named entity recognition is an important stage in constructing a knowledge graph. Based on the national military standard and software testing documents, this work defines the entity types and constructs and labels a data set. In the software testing field, joint character-and-word entity recognition methods suffer from low recognition precision; to address this, the character-level feature extraction method is improved and the CWA-BiLSTM-CRF (character and word attention-bidirectional long short-term memory-conditional random field) recognition framework is proposed. The framework consists of two parts. The first part constructs a pre-trained word fusion dictionary, feeds words and characters together into a bidirectional long short-term memory network for training, and adds an attention mechanism that measures the semantic contribution of each character within a word in order to extract character-level features. The second part splices the character-level features with the word vectors, feeds the result into a bidirectional long short-term memory network for training, and then applies a conditional random field to resolve unreasonable label sequences, thereby identifying the entities in the text. The proposed method is compared experimentally with three commonly used deep learning character-level feature extraction methods: both accuracy and recall improve, and the best F1 value is 88.93%. The experiments show that the improved method is suitable for named entity recognition in the military software testing field and lays a foundation for the subsequent construction of the knowledge graph.
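The attention and splicing steps described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the scoring function, so dot-product scoring against a hypothetical `query` vector is assumed here, and the toy `char_states` stand in for the per-character hidden states that a bidirectional LSTM would produce. The names `attention_pool` and `splice` are likewise illustrative.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(char_states, query):
    """Pool per-character states into one character-level feature vector.

    Each character is scored by a dot product with `query` (an assumed,
    hypothetical stand-in for a learned parameter); the softmax of the
    scores gives each character's contribution to the word.
    """
    scores = [sum(h_d * q_d for h_d, q_d in zip(h, query)) for h in char_states]
    weights = softmax(scores)
    dim = len(char_states[0])
    pooled = [sum(w * h[d] for w, h in zip(weights, char_states)) for d in range(dim)]
    return pooled, weights


def splice(char_feature, word_vector):
    """Concatenate the character-level feature with the word vector."""
    return list(char_feature) + list(word_vector)


# Toy data: a three-character word with 2-dimensional character states
# and a 3-dimensional word vector (all values are made up).
char_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query = [1.0, 0.0]
word_vector = [0.5, -0.5, 0.25]

char_feature, weights = attention_pool(char_states, query)
token_input = splice(char_feature, word_vector)
```

In this sketch `token_input` plays the role of the spliced representation that the second part of the framework would pass to its bidirectional LSTM and conditional random field layers.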
Keywords