Robust Auditory Functions Based on Probabilistic Integration of MUSIC and CGMM

Yoshiaki Bando; Yoshiki Masuyama; Yoko Sasaki; Masaki Onishi

doi:10.1109/ACCESS.2021.3064305

IEEE Access (Jan 2021)

Affiliations

Yoshiaki Bando: National Institute of Advanced Industrial Science and Technology, Tokyo, Japan
Yoshiki Masuyama: National Institute of Advanced Industrial Science and Technology, Tokyo, Japan
Yoko Sasaki: National Institute of Advanced Industrial Science and Technology, Tokyo, Japan
Masaki Onishi: National Institute of Advanced Industrial Science and Technology, Tokyo, Japan

DOI: https://doi.org/10.1109/ACCESS.2021.3064305
Journal volume & issue: Vol. 9
pp. 38718 – 38730

Abstract

Sound source localization and separation are essential functions for robot audition to comprehend acoustic environments. The widely used multiple signal classification (MUSIC) method can precisely estimate the directions of arrival (DoAs) of multiple sound sources if its hyperparameters are selected appropriately for the surrounding environment. A popular separation method based on a complex Gaussian mixture model (CGMM), on the other hand, can extract multiple sources even in noisy environments if its latent variables are properly initialized to avoid bad local optima. To overcome the drawbacks of both MUSIC and the CGMM, we propose a robot audition framework that combines the two complementarily in a probabilistic manner. Our method is based on a variant of the CGMM conditioned on the localization results of MUSIC. The hyperparameters of MUSIC are estimated by type-II maximum likelihood estimation of the CGMM, and the CGMM itself is efficiently initialized and regularized by using the localization results of MUSIC. Experimental results show that our method outperforms conventional localization and separation methods even when the number of sound sources is unknown. We also demonstrate that our method can work in real time even with moving sound sources.

Published in IEEE Access

ISSN: 2169-3536 (Online)
Publisher: IEEE
Country of publisher: United States
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering
Website: https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=6287639
