IET Collaborative Intelligent Manufacturing (Dec 2024)
Domain‐adaptation‐based named entity recognition with information enrichment for equipment fault knowledge graph
Abstract
Numerous files, such as records and logs, are generated during equipment diagnosis and maintenance (D&M). These files contain a large amount of unstructured plain text, and the knowledge they hold could be reused when similar equipment faults occur. In practice, knowledge presented in plain text is hard to acquire. Automated named entity recognition (NER) and relation extraction (RE) methods based on pretrained encoders can therefore be used to extract entities and relations and build a structured knowledge graph (KG), thereby facilitating intelligent manufacturing. However, equipment fault NER performs suboptimally with existing encoders pretrained on general‐domain corpora. In this paper, domain‐adaptation‐based NER with information enrichment is proposed for developing an equipment fault KG. A domain‐adapted encoder is tailored for equipment fault NER through domain‐adaptive pretraining (DAPT). During DAPT, the word segmentation dictionary is updated and the masking approach is adjusted for information enrichment, which helps make the most of the limited domain‐specific pretraining corpus. Experimental results show that the domain‐adapted encoder improves the F1 score of NER by 1.22% compared with the encoder pretrained on a general‐domain corpus. Furthermore, a reliable and robust question answering (QA) application of the developed equipment fault KG is demonstrated.
Keywords