EURASIP Journal on Audio, Speech, and Music Processing (Oct 2024)
DOA-informed switching independent vector extraction and beamforming for speech enhancement in underdetermined situations
Abstract
This paper proposes novel methods for extracting a single Speech signal of Interest (SOI) from a multichannel observed signal in underdetermined situations, i.e., when the observed signal contains more speech signals than microphones. It focuses on extracting the SOI using prior knowledge of the SOI’s Direction of Arrival (DOA). Conventional beamformers (BFs) and Blind Source Separation (BSS) with spatial regularization struggle to suppress interfering speech signals in such situations. Although Switching Minimum Power Distortionless Response BF (Sw-MPDR) can handle underdetermined situations using a switching mechanism, its estimation accuracy significantly decreases when it relies on a steering vector determined by the SOI’s DOA. Spatially-Regularized Independent Vector Extraction (SRIVE) can robustly enhance the SOI based solely on its DOA using spatial regularization, but its performance degrades in underdetermined situations. This paper extends these conventional methods to overcome their limitations. First, we introduce a time-varying Gaussian (TVG) source model to Sw-MPDR to effectively enhance the SOI based solely on the DOA. Second, we introduce the switching mechanism to SRIVE to improve its speech enhancement performance in underdetermined situations. These two proposed methods are called Switching weighted MPDR (Sw-wMPDR) and Switching SRIVE (Sw-SRIVE). We experimentally demonstrate that both surpass conventional methods in enhancing the SOI using the DOA in underdetermined situations.
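For readers unfamiliar with the baseline, the following is a minimal sketch of a conventional (non-switching, unweighted) MPDR beamformer at a single frequency bin, with the steering vector derived from the DOA. It is not the proposed Sw-wMPDR or Sw-SRIVE. The uniform linear array geometry, the far-field model, the diagonal loading, and all function and parameter names (`steering_vector`, `mpdr_weights`, `spacing_m`, `diag_load`) are assumptions made for illustration and do not come from the paper.

```python
import numpy as np

def steering_vector(doa_rad, n_mics, spacing_m, freq_hz, c=343.0):
    """Far-field steering vector for an assumed uniform linear array."""
    m = np.arange(n_mics)
    delays = m * spacing_m * np.sin(doa_rad) / c  # per-microphone delay in seconds
    return np.exp(-2j * np.pi * freq_hz * delays)

def mpdr_weights(X, a, diag_load=1e-6):
    """MPDR weights w = R^{-1} a / (a^H R^{-1} a) for one frequency bin.

    X: (n_mics, n_frames) complex STFT coefficients of the observed signal.
    a: (n_mics,) steering vector toward the SOI.
    """
    n_mics, n_frames = X.shape
    R = X @ X.conj().T / n_frames  # sample spatial covariance of the observation
    # Small diagonal loading for numerical stability (an illustrative choice).
    R = R + diag_load * np.trace(R).real / n_mics * np.eye(n_mics)
    Ri_a = np.linalg.solve(R, a)
    return Ri_a / (a.conj() @ Ri_a)

# Toy usage: one source from 30 degrees plus weak sensor noise, 4 microphones.
rng = np.random.default_rng(0)
a = steering_vector(np.deg2rad(30.0), n_mics=4, spacing_m=0.05, freq_hz=1000.0)
s = rng.standard_normal(200) + 1j * rng.standard_normal(200)
noise = 0.1 * (rng.standard_normal((4, 200)) + 1j * rng.standard_normal((4, 200)))
X = np.outer(a, s) + noise

w = mpdr_weights(X, a)
y = w.conj() @ X  # beamformer output, one value per frame
```

By construction the weights satisfy the distortionless constraint, i.e. `w.conj() @ a` equals 1 up to numerical precision. The constraint holds only for the assumed steering vector, which is why a mismatch between the DOA-derived steering vector and the true acoustic transfer function can degrade the estimate.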
Keywords