IEEE Access (Jan 2022)

Acoustic-Based Train Arrival Detection Using Convolutional Neural Networks With Attention

  • Van-Thuan Tran,
  • Wei-Ho Tsai

DOI
https://doi.org/10.1109/ACCESS.2022.3185224
Journal volume & issue
Vol. 10
pp. 72120 – 72131

Abstract

At railroad crossings, audible warning signals such as train whistles and railway alarms warn road users to pay attention and give priority to the approaching train(s). However, road users may sometimes be unaware of these warning signals for various reasons, resulting in inappropriate cooperation or even collisions between railway and non-railway vehicles. This work studies deep learning-based approaches to developing systems for acoustic-based train arrival detection (A-TAD). First, we develop a novel audio dataset of train horns, railway alarms, railway noise, and other urban noises for conducting A-TAD experiments. We then examine the effectiveness of handcrafted acoustic features (i.e., MFCC and Mel-spectrogram) in building A-TAD’s audio classifier, the MSNet, which is based on two-dimensional convolutional neural networks (2D-CNN). Next, we propose applying the attention mechanism and using MFCC and spectrogram simultaneously to enhance classification accuracy, where the acoustic features are combined at the input level (with InCom-TADNet), the high-level feature level (with FCCom-TADNet), and the decision level (with DLCom-TADNet). Our experiments show the effectiveness of MSNet and the attention mechanism: the MSNet trained with a single feature outperforms the baseline models, and adding attention modules yields higher accuracy. In addition, the combined use of MFCC and spectrogram significantly improves the system’s accuracy and robustness. A-TAD systems can be used to extend the safety functions of railway crossing systems, private cars, and self-driving cars, and can be particularly useful for hearing-impaired road users.
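The three combination levels named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' InCom-TADNet, FCCom-TADNet, or DLCom-TADNet: the array shapes, the four-class output, the random linear "encoders" standing in for CNN branches, and the simple averaging at the decision level are all assumptions made for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical feature maps for one clip: 40 rows by 100 time frames each.
mfcc = rng.standard_normal((40, 100))
spec = rng.standard_normal((40, 100))
NUM_CLASSES = 4  # assumed: train horn, railway alarm, railway noise, other


def softmax(z):
    """Numerically stable softmax over a 1-D vector."""
    e = np.exp(z - z.max())
    return e / e.sum()


def toy_encoder(x, out_dim, seed):
    """Stand-in for a CNN branch: fixed random linear map followed by ReLU."""
    w = np.random.default_rng(seed).standard_normal((out_dim, x.size))
    return np.maximum(w @ x.ravel() / np.sqrt(x.size), 0.0)


def toy_classifier(h, seed):
    """Stand-in for a dense softmax output layer."""
    w = np.random.default_rng(seed).standard_normal((NUM_CLASSES, h.size))
    return softmax(w @ h / np.sqrt(h.size))


# Input level: stack both features as channels and feed a single network.
stacked = np.stack([mfcc, spec], axis=0)  # shape (2, 40, 100)
p_input = toy_classifier(toy_encoder(stacked, 32, seed=1), seed=2)

# High-level feature level: one encoder per feature, embeddings concatenated
# before a shared classifier.
h = np.concatenate([toy_encoder(mfcc, 32, seed=3),
                    toy_encoder(spec, 32, seed=4)])
p_feature = toy_classifier(h, seed=5)

# Decision level: two complete classifiers whose class probabilities are
# merged (here by plain averaging, which is an assumption).
p_mfcc = toy_classifier(toy_encoder(mfcc, 32, seed=6), seed=7)
p_spec = toy_classifier(toy_encoder(spec, 32, seed=8), seed=9)
p_decision = (p_mfcc + p_spec) / 2.0
```

The sketch only shows where in the pipeline the two features meet; a real system would replace the toy encoders with trained 2D-CNN branches and could add attention modules within them.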

Keywords