Aerospace (Oct 2024)
Hazard Analysis for Massive Civil Aviation Safety Oversight Reports Using Text Classification and Topic Modeling
Abstract
Massive numbers of civil aviation safety oversight reports are collected each year in China's civil aviation sector. The narrative texts of these reports are typically short and record the abnormal events detected during the safety oversight process. In the construction of an intelligent civil aviation safety oversight system, the automatic classification of safety oversight texts is a key and fundamental task. However, all safety oversight reports are currently analyzed and classified manually, which is time-consuming and labor-intensive. In recent years, pre-trained language models have been applied to various text mining tasks and have proven effective. The aim of this paper is to apply text classification to the mining of these narrative texts and to show that text classification technology can be a critical element of aviation safety oversight report analysis. To this end, we propose a novel method for classifying the narrative texts in safety oversight reports. Extensive experiments validate the effectiveness of each proposed component, and the results show that our method outperforms existing methods on a self-built civil aviation safety oversight dataset. We also examine the classification precision and related results on this dataset in detail, which provides a basis for improving data quality and making better use of the information in the reports.
Keywords