Feature Aggregation in Joint Sound Classification and Localization Neural Networks

Brendan Healy; Patrick Mcnamee; Zahra Nili Ahmadabadi

doi:10.1109/ACCESS.2024.3438947

IEEE Access (Jan 2024)

Affiliations

Brendan Healy: Department of Mechanical Engineering, San Diego State University, San Diego, CA, USA
Patrick Mcnamee: Department of Mechanical Engineering, San Diego State University, San Diego, CA, USA
Zahra Nili Ahmadabadi: Department of Mechanical Engineering, San Diego State University, San Diego, CA, USA

DOI: https://doi.org/10.1109/ACCESS.2024.3438947
Journal volume & issue: Vol. 12
pp. 109157 – 109170

Abstract

Current state-of-the-art sound source localization (SSL) deep learning networks lack feature aggregation within their architecture. Feature aggregation within neural network architectures enhances model performance by enabling the consolidation of information from different feature scales, thereby improving feature robustness and invariance. We adapt feature aggregation sub-architectures from computer vision neural networks onto a baseline neural network architecture for SSL, the Sound Event Localization and Detection network (SELDnet). The incorporated sub-architectures are the Path Aggregation Network (PANet), the Weighted Bi-directional Feature Pyramid Network (BiFPN), and a novel Scale Encoding Network (SEN). These sub-architectures were evaluated using two metrics for signal classification and two metrics for direction-of-arrival regression. The results show that models incorporating feature aggregation outperformed the baseline SELDnet in both sound signal classification and localization. Among the feature aggregators, PANet performed best; the other methods performed comparably to one another. The results provide evidence that feature aggregation sub-architectures enhance the performance of sound detection neural networks, particularly in direction-of-arrival regression.
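
To make "consolidation of information from different feature scales" concrete, the following is a minimal NumPy sketch of one aggregation step in the style of BiFPN's weighted fusion: a coarse feature map is upsampled to the resolution of a finer one, and the two are combined with non-negative, normalized weights. It is an illustration only, not the authors' implementation. The function names, the (channels, time, frequency) layout, the nearest-neighbour upsampling, and the fixed weights are all assumptions made for this example; in a trained network the fusion weights would be learned.

```python
import numpy as np

def upsample_nearest(x, factor):
    """Nearest-neighbour upsampling of a (channels, time, freq) map on its last two axes."""
    return x.repeat(factor, axis=1).repeat(factor, axis=2)

def fast_normalized_fusion(features, weights, eps=1e-4):
    """Weighted fusion: clamp weights to be non-negative, normalize them, then sum the maps."""
    w = np.maximum(np.asarray(weights, dtype=float), 0.0)
    w = w / (w.sum() + eps)
    return sum(wi * f for wi, f in zip(w, features))

def aggregate_top_down(fine, coarse, weights=(1.0, 1.0)):
    """Fuse a fine-scale map with an upsampled coarse-scale map.

    Assumes (for this sketch) that both maps have the same channel count and
    that the fine map is an integer multiple of the coarse map in size.
    """
    factor = fine.shape[1] // coarse.shape[1]
    return fast_normalized_fusion([fine, upsample_nearest(coarse, factor)], weights)

# Hypothetical feature maps at two scales, filled with constants for clarity.
fine = np.ones((4, 8, 8))
coarse = np.full((4, 4, 4), 3.0)
fused = aggregate_top_down(fine, coarse)
print(fused.shape)
```

With equal weights the fused map is approximately the element-wise mean of the two inputs; a negative weight is clamped to zero, so that input is ignored. PANet-style aggregation adds a second, bottom-up pass over the fused maps, which this sketch omits.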

Published in IEEE Access

ISSN: 2169-3536 (Online)
Publisher: IEEE
Country of publisher: United States
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering
Website: https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=6287639
