IEEE Access (Jan 2023)
SELD U-Net: Joint Optimization of Sound Event Localization and Detection With Noise Reduction
Abstract
Sound event localization and detection (SELD) is a combined task that classifies acoustic events in audio signals, estimates their temporal boundaries, and identifies their locations. As more industries make use of audio signals, SELD has been applied in a variety of fields, and deep-learning-based approaches are being actively studied to make it effective in practice. However, current deep-learning-based SELD research focuses mainly on improving performance in noise-free environments, so performance degrades in noisy environments. To address this problem, this study proposes SELD U-Net, a robust model that performs SELD in noisy environments. The proposed model combines a U-Net, which removes noise, with a SELDnet, which performs SELD. The model was trained and evaluated on noisy data with various noise levels. The results confirmed that the proposed model outperforms existing deep-learning-based SELD models in environments with high levels of noise.
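The abstract mentions training and evaluation on noisy data. One common way to construct such data, assumed here for illustration and not confirmed by the source, is to mix a noise recording into a clean signal at a chosen signal-to-noise ratio (SNR). The following is a minimal sketch of that idea; the function names `rms`, `mix_at_snr`, and `measured_snr_db`, the 16 kHz sample rate, and the SNR values are all hypothetical.

```python
import math
import random


def rms(signal):
    """Root-mean-square amplitude of a sequence of samples."""
    return math.sqrt(sum(x * x for x in signal) / len(signal))


def mix_at_snr(clean, noise, snr_db):
    """Scale `noise` so the clean-to-noise ratio equals `snr_db`, then add it.

    Uses SNR_dB = 20 * log10(rms(clean) / rms(scaled_noise)).
    """
    if len(clean) != len(noise):
        raise ValueError("clean and noise must have the same length")
    noise_rms = rms(noise)
    if noise_rms == 0:
        raise ValueError("noise signal is silent")
    target_noise_rms = rms(clean) / (10 ** (snr_db / 20))
    gain = target_noise_rms / noise_rms
    return [c + gain * n for c, n in zip(clean, noise)]


def measured_snr_db(clean, mixture):
    """Recover the SNR of a mixture, given the clean signal it was built from."""
    residual = [m - c for m, c in zip(mixture, clean)]
    return 20 * math.log10(rms(clean) / rms(residual))


# Hypothetical example: a 440 Hz tone at an assumed 16 kHz sample rate,
# mixed with uniform pseudo-random noise at three assumed SNR levels.
rng = random.Random(0)
clean = [math.sin(2 * math.pi * 440 * t / 16000) for t in range(1600)]
noise = [rng.uniform(-1.0, 1.0) for _ in range(1600)]

for snr_db in (20, 10, 0):
    noisy = mix_at_snr(clean, noise, snr_db)
    print(snr_db, round(measured_snr_db(clean, noisy), 3))
```

Lower SNR values correspond to the "high levels of noise" condition the abstract refers to. The actual noise sources and levels used in the study are not stated in this section.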
Keywords