EURASIP Journal on Advances in Signal Processing (Sep 2022)

Deficient-basis-complementary rank-constrained spatial covariance matrix estimation based on multivariate generalized Gaussian distribution for blind speech extraction

  • Yuto Kondo,
  • Yuki Kubo,
  • Norihiro Takamune,
  • Daichi Kitamura,
  • Hiroshi Saruwatari

DOI
https://doi.org/10.1186/s13634-022-00905-z
Journal volume & issue
Vol. 2022, no. 1
pp. 1 – 24

Abstract

Rank-constrained spatial covariance matrix estimation (RCSCME) is a blind speech extraction method used when a single directional target speech source is mixed with diffuse background noise. In this paper, we propose a new model extension of RCSCME. RCSCME simultaneously performs two tasks: complementing the deficient rank-1 component of the diffuse-noise spatial covariance matrix, which is only incompletely estimated by preprocessing methods such as independent low-rank matrix analysis, and estimating the source model parameters. In the conventional RCSCME, of the two parameters that constitute the deficient rank-1 component, only the scale is estimated, whereas the other, the deficient basis, is fixed in advance; however, the choice of the fixed deficient basis is not unique. In the proposed RCSCME model, we also treat the deficient basis as a parameter to be estimated. As the generative model of the observed signal, we use the super-Gaussian generalized Gaussian distribution, which achieves better separation performance than the Gaussian distribution used in the conventional RCSCME. Under this model, we derive new update rules for the deficient basis based on the majorization-minimization (MM) and majorization-equalization (ME) algorithms. In particular, among the innumerable possible ME-algorithm-based update rules, we find one for which we can prove mathematically that its update step is larger than that of the MM-algorithm-based update rule. We confirm that the proposed method outperforms conventional methods under several simulated noise conditions and a real noise condition.
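
The rank-1 complementation described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function name `complement_rank1`, the variable names, and the toy data are hypothetical. The sketch assumes that the complemented matrix has the form of the deficient matrix plus scale times b b^H, and, as one possible fixed choice of the deficient basis b, it takes the eigenvector belonging to the smallest eigenvalue of the rank-deficient matrix. The abstract itself notes that this choice is not unique, and the estimation of the scale and basis by the MM and ME update rules is not shown.

```python
import numpy as np

def complement_rank1(R_deficient, scale, basis=None):
    """Return R_deficient + scale * b b^H (hypothetical helper).

    If no basis is given, fall back to the eigenvector belonging to the
    smallest eigenvalue of R_deficient. This is an illustrative assumption,
    not necessarily the choice made in the paper.
    """
    if basis is None:
        # eigh returns eigenvalues in ascending order for Hermitian input,
        # so column 0 spans the (numerical) null space here.
        _, eigvecs = np.linalg.eigh(R_deficient)
        basis = eigvecs[:, 0]
    b = basis / np.linalg.norm(basis)
    # np.outer does not conjugate, so conjugate explicitly to form b b^H.
    return R_deficient + scale * np.outer(b, b.conj())

# Toy data: M = 4 channels and a Hermitian positive semidefinite matrix
# of rank M - 1, standing in for an incompletely estimated noise covariance.
rng = np.random.default_rng(0)
M = 4
A = rng.standard_normal((M, M - 1)) + 1j * rng.standard_normal((M, M - 1))
R_deficient = A @ A.conj().T

R_full = complement_rank1(R_deficient, scale=0.5)

# An explicit tolerance keeps the rank check robust to rounding error.
rank_before = np.linalg.matrix_rank(R_deficient, tol=1e-8)
rank_after = np.linalg.matrix_rank(R_full, tol=1e-8)
print(rank_before, rank_after)
```

In this sketch only `scale` would be free in the conventional setting, whereas the proposed model described above would also treat the vector passed as `basis` as a parameter to be estimated.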

Keywords