Noise Mapping (Jul 2019)

Classification and mapping of sound sources in local urban streets through AudioSet data and Bayesian optimized Neural Networks

  • Verma Deepank,
  • Jana Arnab,
  • Ramamritham Krithi

DOI
https://doi.org/10.1515/noise-2019-0005
Journal volume & issue
Vol. 6, no. 1
pp. 52 – 71

Abstract

Deep learning (DL) methods have provided several breakthroughs over conventional data analysis techniques, especially with image and audio datasets. Rapid assessment and large-scale quantification of environmental attributes have become possible through such models. This study focuses on the creation of Artificial Neural Network (ANN) and Recurrent Neural Network (RNN) based models to classify sound sources from manually collected sound clips in local streets. A subset of the openly available AudioSet data is used to train and evaluate the models against the common sound classes present in urban streets. Audio data are collected at random locations in the selected study area of 0.2 sq. km. The audio clips are further classified according to the extent of anthropogenic (mainly traffic), natural, and human-based sounds present at particular locations. Rather than tuning model hyperparameters manually, the study uses Bayesian Optimization to obtain hyperparameter values for the Neural Network models. The optimized models produce overall accuracies of 89 percent and 60 percent on the evaluation set for the three-class and fifteen-class models, respectively. The model detections are mapped across the study area with the help of the Inverse Distance Weighted (IDW) spatial interpolation method.
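To illustrate the hyperparameter search mentioned in the abstract, the sketch below shows one way Bayesian Optimization can be wired up in Python. The library (scikit-optimize), the search space, the stand-in classifier, and the synthetic data are illustrative assumptions for this sketch, not the study's actual models, features, or search ranges.

```python
# Illustrative sketch: Bayesian search over a few neural-network hyperparameters
# using scikit-optimize's gp_minimize. The search space, stand-in MLP classifier,
# and synthetic data are assumptions, not the study's configuration.
import numpy as np
from skopt import gp_minimize
from skopt.space import Integer, Real
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 40))      # stand-in for per-clip audio features
y = rng.integers(0, 3, size=300)    # stand-in labels (e.g. three sound classes)

space = [
    Integer(16, 256, name="hidden_units"),
    Real(1e-4, 1e-1, prior="log-uniform", name="learning_rate"),
    Real(1e-6, 1e-2, prior="log-uniform", name="alpha"),
]

def objective(params):
    hidden_units, learning_rate, alpha = params
    model = MLPClassifier(hidden_layer_sizes=(hidden_units,),
                          learning_rate_init=learning_rate,
                          alpha=alpha, max_iter=300, random_state=0)
    # gp_minimize minimizes, so return negative cross-validated accuracy
    score = cross_val_score(model, X, y, cv=3, scoring="accuracy").mean()
    return -score

result = gp_minimize(objective, space, n_calls=20, random_state=0)
print("best hyperparameters:", result.x, "best CV accuracy:", -result.fun)
```

In this setup the Gaussian-process surrogate proposes each new hyperparameter combination based on the objective values observed so far, which typically needs far fewer training runs than an exhaustive grid search.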
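The IDW mapping step can likewise be sketched in a few lines. The coordinates, values, power parameter, and grid below are hypothetical; the authors may equally have used a GIS package's built-in IDW tool rather than code like this.

```python
# Illustrative sketch: inverse distance weighted (IDW) interpolation of point
# observations onto a regular grid. All coordinates and values are made up.
import numpy as np

def idw(xy_known, z_known, xy_query, power=2.0, eps=1e-12):
    """Interpolate z at query points as a distance-weighted mean of known values."""
    d = np.linalg.norm(xy_query[:, None, :] - xy_known[None, :, :], axis=2)
    w = 1.0 / np.maximum(d, eps) ** power   # weights fall off with distance
    w /= w.sum(axis=1, keepdims=True)       # normalise weights per query point
    return w @ z_known

# Hypothetical "sound class share" observed at five sampled street locations
xy_known = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]])
z_known = np.array([0.9, 0.4, 0.6, 0.2, 0.7])

# Regular grid covering the (hypothetical) study area
gx, gy = np.meshgrid(np.linspace(0, 1, 50), np.linspace(0, 1, 50))
grid = np.column_stack([gx.ravel(), gy.ravel()])
z_grid = idw(xy_known, z_known, grid).reshape(gx.shape)
print(z_grid.shape, z_grid.min(), z_grid.max())
```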