Proceedings of the XXth Conference of Open Innovations Association FRUCT (Sep 2020)
Comparative Assessment of Data Augmentation for Semi-Supervised Polyphonic Sound Event Detection
Abstract
In the context of audio ambient intelligence systems in Smart Buildings, polyphonic Sound Event Detection aims at detecting, localizing, and classifying any sound event recorded in a room. Today, most models are based on Deep Learning and require large databases for training. We propose a CRNN system that exploits unlabeled data through semi-supervised learning based on the "Mean teacher" method, combined with data augmentation to overcome the limited size of the training dataset and to further improve performance. This model was submitted to the DCASE 2019 challenge and ranked second. In the present study, several conventional data augmentation techniques are compared: time shifting, frequency shifting, and background noise addition. It is shown that data augmentation with time shifting and noise addition, in combination with class-dependent median filtering, improves performance by 9%, leading to an event-based F1-score of 43.2% on the DCASE 2019 validation set. However, the tools used so far for data augmentation generally rely on a coarse model (i.e. random variation of data) of the intra-class variability observed in real life. This raises the question of whether incorporating acoustic knowledge into the design of augmentation methods would be advantageous. A physics-inspired approach is outlined for future work.
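The time-shifting and noise-addition augmentations compared in the study can be sketched as operations on a time-frequency representation. The sketch below is illustrative only: the paper's actual shift ranges, noise levels, and feature pipeline are not specified in the abstract, so the parameter choices (maximum shift of 16 frames, 20 dB SNR) and function names are assumptions.

```python
import numpy as np

def time_shift(spec, max_shift=16, rng=None):
    """Randomly roll a (freq, time) spectrogram along the time axis.

    A coarse model of intra-class variability: the same event simply
    occurs earlier or later in the clip (circular shift for simplicity).
    max_shift is an illustrative value, not the paper's setting."""
    rng = rng or np.random.default_rng()
    shift = int(rng.integers(-max_shift, max_shift + 1))
    return np.roll(spec, shift, axis=1)

def add_noise(spec, snr_db=20.0, rng=None):
    """Add white Gaussian noise at a target signal-to-noise ratio (dB)."""
    rng = rng or np.random.default_rng()
    signal_power = np.mean(spec ** 2)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    noise = rng.normal(0.0, np.sqrt(noise_power), size=spec.shape)
    return spec + noise

# Example: augment a dummy 64-mel-band, 500-frame spectrogram.
rng = np.random.default_rng(0)
spec = rng.random((64, 500))
augmented = add_noise(time_shift(spec, rng=rng), snr_db=20.0, rng=rng)
print(augmented.shape)  # (64, 500)
```

In practice such transforms are applied on the fly during training, so each epoch sees a slightly different variant of every clip; frequency shifting would be the analogous roll along axis 0.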
Keywords