IEEE Access (Jan 2023)

Extracting Fallen Objects on the Road From Accident Reports Using a Natural Language Processing Model-Based Approach

  • Seung-Seok Lee,
  • So-Mi Cha,
  • Bonggyun Ko,
  • Je Jin Park

DOI
https://doi.org/10.1109/ACCESS.2023.3339774
Journal volume & issue
Vol. 11
pp. 139521 – 139533

Abstract

Keyword extraction is an effective way to quickly identify key elements in text. Applied to incident report analysis, it can accelerate the identification of the key factors that contribute to accidents. Our research presents a process for extracting keywords from incident reports using pre-trained natural language processing models. We fine-tune pre-trained language models combined with a BiLSTM-CRF and a fully connected layer, and we treat keyphrase extraction as a sequence labeling task. To analyze incident reports from Korea, we employ pre-trained models customized for the Korean language, such as KoBERT and KoELECTRA. We assess our approach using a range of metrics, including accuracy, area under the curve (AUC), F1-score, slot error rate (SER), and simple matching coefficient (SMC). In contrast to traditional approaches that mainly concentrate on document summarization, our research provides a method tailored to identifying falling objects as the main cause of accidents. Our findings demonstrate that the ELECTRA-based model with a BiLSTM-CRF outperforms the other models, achieving an accuracy of 0.943, an AUC of 0.991, and a low SER of 0.075. Its F1-score and SMC closely match those of the BERT-based model with a BiLSTM-CRF, with no significant differences observed within the 95% confidence interval. These results underscore the potential of fine-tuning pre-trained models for post-hoc traffic accident analysis. The method offers a swift preliminary step for identifying key factors before human analysis, supporting efforts to enhance road safety and prevent accidents.
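
To illustrate what treating keyphrase extraction as sequence labeling can look like, the following minimal sketch decodes token-level tags into keyphrase spans and computes a simple matching coefficient over the tokens. It is not the authors' implementation: the B/I/O tag scheme, the English example tokens, the function names, and the token-level binarization used for the SMC are all assumptions made for illustration, and the abstract does not state how the paper defines its metrics.

```python
from typing import List, Tuple


def bio_to_spans(tags: List[str]) -> List[Tuple[int, int]]:
    """Convert B/I/O tags into (start, end_exclusive) keyphrase spans."""
    spans: List[Tuple[int, int]] = []
    start = None
    for i, tag in enumerate(tags):
        if tag == "B":
            if start is not None:      # close the previous span
                spans.append((start, i))
            start = i
        elif tag == "I":
            if start is None:          # stray "I": treat it as a span start
                start = i
        else:                          # "O" ends any open span
            if start is not None:
                spans.append((start, i))
                start = None
    if start is not None:              # span running to the end of the sequence
        spans.append((start, len(tags)))
    return spans


def simple_matching_coefficient(gold: List[str], pred: List[str]) -> float:
    """Share of tokens on which gold and prediction agree about being
    inside a keyphrase (assumed binarization: any non-"O" tag counts)."""
    if not gold or len(gold) != len(pred):
        raise ValueError("tag sequences must be non-empty and equally long")
    matches = sum((g != "O") == (p != "O") for g, p in zip(gold, pred))
    return matches / len(gold)


# Hypothetical example sentence; the paper works on Korean reports.
tokens = ["a", "tire", "fragment", "fell", "on", "the", "road"]
gold = ["O", "B", "I", "O", "O", "O", "O"]
pred = ["O", "B", "O", "O", "O", "O", "O"]

gold_phrases = [" ".join(tokens[s:e]) for s, e in bio_to_spans(gold)]
pred_phrases = [" ".join(tokens[s:e]) for s, e in bio_to_spans(pred)]
smc = simple_matching_coefficient(gold, pred)
```

In this example the predicted phrase covers only part of the gold phrase, so the two taggings agree on six of the seven tokens. In a full pipeline, the predicted tags would come from the fine-tuned encoder with its BiLSTM-CRF head rather than being written by hand.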

Keywords