IEEE Access (Jan 2024)
Sound Source Localization in Spherical Harmonics Domain Based on High-Order Ambisonics Signals Enhancement Neural Network
Abstract
Compared with methods based on microphone arrays, sound source localization methods based on higher-order ambisonics (HOA) signals are not limited to specific array structures and perform better in multi-source scenarios. However, estimation errors in the HOA signals always limit the frequency band available to localization algorithms, which reduces their accuracy and robustness. To address this problem, we propose a sound source localization method that incorporates an HOA signal enhancement neural network (referred to as the network model). The method uses a convolutional neural network (CNN) to eliminate low-frequency noise and high-frequency aliasing errors in the HOA signals, and the noise robustness of the network model is improved by adding noise interference during training. Because the network model improves the consistency of spatial features across the frequency bands of the HOA signals, we directly apply a full-band frequency smoothing algorithm to improve the accuracy of the covariance matrix and combine it with the minimum variance distortionless response algorithm in the eigenbeam domain, i.e., the spherical harmonics domain (EB-MVDR), for sound source localization. Experimental results show that, compared with the traditional EB-MVDR, the proposed method effectively improves the accuracy of multi-source localization in noisy and reverberant environments and performs well for different numbers of sound sources.
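The localization stage summarized above (full-band frequency smoothing of the covariance matrix followed by an EB-MVDR spatial spectrum) can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes ideal first-order ambisonic signals with real spherical harmonics in ACN/N3D convention, a single simulated source, and no neural enhancement step. The function names `sh_order1`, `smoothed_covariance`, and `eb_mvdr_spectrum`, as well as the diagonal loading factor, are hypothetical choices made for illustration.

```python
import numpy as np

def sh_order1(azimuth, elevation):
    """Real first-order spherical harmonics (assumed ACN order, N3D normalization)."""
    az, el = np.broadcast_arrays(np.asarray(azimuth, dtype=float),
                                 np.asarray(elevation, dtype=float))
    x = np.cos(el) * np.cos(az)
    y = np.cos(el) * np.sin(az)
    z = np.sin(el)
    return np.stack([np.ones_like(x), np.sqrt(3) * y,
                     np.sqrt(3) * z, np.sqrt(3) * x], axis=-1)

def smoothed_covariance(hoa_stft):
    """Average the channel outer products over all frequency bins and frames.

    hoa_stft has shape (channels, freqs, frames). Pooling every bin is only
    meaningful because the steering vector in the spherical harmonics domain
    does not depend on frequency.
    """
    c, f, t = hoa_stft.shape
    flat = hoa_stft.reshape(c, f * t)
    return flat @ flat.conj().T / (f * t)

def eb_mvdr_spectrum(cov, steering, loading=1e-3):
    """MVDR spatial spectrum P = 1 / (y^H R^-1 y) for each candidate direction."""
    c = cov.shape[0]
    # Diagonal loading (an assumption of this sketch) keeps the inverse stable.
    reg = cov + loading * np.real(np.trace(cov)) / c * np.eye(c)
    inv = np.linalg.inv(reg)
    denom = np.einsum("gc,cd,gd->g", steering.conj(), inv, steering)
    return 1.0 / np.real(denom)

# Simulated example: one source at 60 degrees azimuth, 0 degrees elevation.
rng = np.random.default_rng(0)
n_freqs, n_frames = 64, 50
source = rng.standard_normal((n_freqs, n_frames)) \
    + 1j * rng.standard_normal((n_freqs, n_frames))
y_src = sh_order1(np.deg2rad(60.0), 0.0)                     # shape (4,)
noise = 0.1 * (rng.standard_normal((4, n_freqs, n_frames))
               + 1j * rng.standard_normal((4, n_freqs, n_frames)))
hoa = y_src[:, None, None] * source[None, :, :] + noise

# Scan the horizontal plane on a 1-degree azimuth grid.
grid_az = np.deg2rad(np.arange(0, 360))
steering = sh_order1(grid_az, np.zeros_like(grid_az))        # shape (360, 4)
spectrum = eb_mvdr_spectrum(smoothed_covariance(hoa), steering)
estimated_deg = float(np.rad2deg(grid_az[np.argmax(spectrum)]))
print("estimated azimuth (deg):", estimated_deg)
```

In this sketch the spectrum peak gives the direction estimate for one source; for several sources, one would pick the largest local maxima of `spectrum` instead of a single `argmax`.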
Keywords