IEEE Access (Jan 2019)

Automatic Detection of Cry Sounds in Neonatal Intensive Care Units by Using Deep Learning and Acoustic Scene Simulation

  • Marco Severini,
  • Daniele Ferretti,
  • Emanuele Principi,
  • Stefano Squartini

DOI
https://doi.org/10.1109/ACCESS.2019.2911427
Journal volume & issue
Vol. 7
pp. 51982 – 51993

Abstract


Cry detection is an important capability in both residential and public environments, as it can address the different needs of private and professional users. In this paper, we investigate the problem of cry detection in professional environments, such as Neonatal Intensive Care Units (NICUs). The aim of our work is to propose a cry detection method based on deep neural networks (DNNs) and to evaluate whether a properly designed synthetic dataset can replace data acquired in the field for training the DNN-based cry detector. In this way, a massive data collection campaign in NICUs can be avoided, and the cry detector can be easily retargeted to different NICUs. The paper presents different solutions based on single-channel and multi-channel DNNs. The experimental evaluation is conducted on a synthetic dataset created by simulating the acoustic scene of a real NICU, and on a real dataset containing audio acquired in the same NICU. The evaluation reveals that using real data in the training phase yields the highest overall performance, with an Area Under the Precision-Recall Curve (PRC-AUC) of 87.28%, when the signals are processed with a beamformer and a post-filter and a single-channel DNN is used. The performance of the same method, however, drops to 70.61% when training is performed on the synthetic dataset. In contrast, under the same conditions, the new single-channel architecture introduced in this paper achieves the highest performance, with a PRC-AUC of 80.48%, showing that the acoustic scene simulation strategy can be used to train a cry detection method with positive results.
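The abstract quotes its results as PRC-AUC percentages. As a minimal sketch of how that metric is typically computed for a frame-level detector (this is not the authors' code; the frame labels, scores, and use of scikit-learn below are illustrative assumptions), one could proceed as follows in Python:

```python
# Minimal sketch of the PRC-AUC evaluation metric quoted in the abstract.
# The per-frame labels and detector scores are hypothetical placeholders.
import numpy as np
from sklearn.metrics import average_precision_score

# Hypothetical ground-truth frame labels: 1 = cry present, 0 = background.
y_true = np.array([0, 0, 1, 1, 1, 0, 0, 1, 0, 0])

# Hypothetical detector scores, e.g. the DNN's sigmoid output per frame.
y_score = np.array([0.1, 0.3, 0.8, 0.7, 0.9, 0.4, 0.2, 0.6, 0.1, 0.3])

# average_precision_score approximates the area under the precision-recall
# curve; scaled by 100 it matches the percentage figures in the abstract.
prc_auc = 100.0 * average_precision_score(y_true, y_score)
print(f"PRC-AUC: {prc_auc:.2f}%")
```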

Keywords