Zhongguo dianli (Jul 2023)
The Construction of the Professional Dictionary of Relay Protection Defect Text in a Regional Power Grid and Its Natural Language Characteristics Analysis
Abstract
The massive volume of defect text recorded for relay protection devices has not been mined with the help of a professional dictionary. As a result, it provides insufficient support for grading, diagnosing, and eliminating relay protection defects, and cannot meet the needs of efficient operation and maintenance. This paper proposes a method for constructing a professional dictionary for relay protection device defects and builds such a dictionary using a regional power grid as an example. First, relevant defect logs and management protocols are aggregated to form a defect text corpus. Second, a regular-expression-based stop word identification method is applied to remove irrelevant words from the defect text. Third, a combined automatic and manual method is used to build a relay protection defect text dictionary. In addition, latent semantic analysis and decision tree classification are used to merge synonyms. Integrating the stop word list, the word segmentation lexicon, and the synonym list yields a specialized dictionary of protection device defects for the regional power grid. Finally, the Zipf distribution of the professional dictionary is analyzed, and the information entropy of the corpus is compared before and after applying the dictionary; the results demonstrate the effectiveness of the professional dictionary.
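The two evaluation measures named at the end of the abstract can be illustrated with a minimal sketch. The abstract does not give the formulas used, so the code below assumes the standard definitions: Shannon entropy over unigram token frequencies, and a Zipf exponent estimated as the least-squares slope of log frequency against log rank. The function names `shannon_entropy` and `zipf_slope` and the toy tokens are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def shannon_entropy(tokens):
    """Shannon entropy (bits per token) of the unigram distribution.

    Assumption: entropy is computed over token frequencies; the paper's
    exact definition is not stated in the abstract.
    """
    counts = Counter(tokens)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def zipf_slope(tokens):
    """Least-squares slope of log(frequency) versus log(rank).

    A slope near -1 is the classic Zipf pattern. Requires at least two
    distinct tokens, otherwise the fit is undefined.
    """
    freqs = sorted(Counter(tokens).values(), reverse=True)
    if len(freqs) < 2:
        raise ValueError("need at least two distinct tokens")
    xs = [math.log(rank) for rank in range(1, len(freqs) + 1)]
    ys = [math.log(freq) for freq in freqs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


if __name__ == "__main__":
    # Toy token lists, invented for illustration only. The second list
    # stands in for a corpus segmented with a domain dictionary.
    generic = ["protect", "ion", "device", "alarm", "protect", "ion", "fault"]
    with_dictionary = ["protection", "device", "alarm", "protection", "fault"]
    print("entropy (generic segmentation):", shannon_entropy(generic))
    print("entropy (dictionary segmentation):", shannon_entropy(with_dictionary))
    print("Zipf slope (dictionary segmentation):", zipf_slope(with_dictionary))
```

Comparing the entropy of the same corpus under two segmentations, as in the usage lines above, mirrors the before-and-after comparison the abstract describes; which direction of change indicates improvement depends on the paper's own criteria, which the abstract does not spell out.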
Keywords