IEEE Open Journal of Signal Processing (Jan 2024)

Binaural Multichannel Blind Speaker Separation With a Causal Low-Latency and Low-Complexity Approach

  • Nils L. Westhausen,
  • Bernd T. Meyer

DOI
https://doi.org/10.1109/OJSP.2023.3343320
Journal volume & issue
Vol. 5
pp. 238 – 247

Abstract

In this article, we introduce a causal, low-latency, low-complexity approach for binaural multichannel blind speaker separation in noisy reverberant conditions. The model, referred to as Group Communication Binaural Filter and Sum Network (GCBFSnet), predicts complex filters for filter-and-sum beamforming in the time-frequency domain. We apply Group Communication (GC), in which latent model variables are split into groups and processed with a shared sequence model, to reduce the complexity of a simple model containing only one convolutional and one recurrent module. With GC, we are able to reduce the size of the model by up to 83% and the complexity by up to 73% compared to the model without GC, while mostly retaining performance. Even in its smallest configuration, GCBFSnet matches the performance of a low-complexity TasNet baseline in most metrics, although the baseline is larger and requires more operations.
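
The two ideas named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes one complex filter weight per microphone channel and frequency bin for a single frame, and it stands in a plain Python function for the shared sequence model. The names `filter_and_sum`, `group_communication`, and `dense_weights` are hypothetical.

```python
from typing import Callable, List, Sequence


def filter_and_sum(stft: Sequence[Sequence[complex]],
                   filters: Sequence[Sequence[complex]]) -> List[complex]:
    """Weight each channel with a complex filter, then sum over channels.

    stft[c][k]    : mixture STFT value of microphone channel c at bin k (one frame)
    filters[c][k] : complex filter for channel c at bin k (assumed single-tap)
    """
    num_bins = len(stft[0])
    return [sum(w[k] * x[k] for w, x in zip(filters, stft))
            for k in range(num_bins)]


def group_communication(features: Sequence[float], num_groups: int,
                        shared_fn: Callable[[List[float]], List[float]]
                        ) -> List[float]:
    """Split a feature vector into equal groups and apply one shared function."""
    if len(features) % num_groups != 0:
        raise ValueError("feature size must be divisible by num_groups")
    size = len(features) // num_groups
    out: List[float] = []
    for g in range(num_groups):
        # The same function (same parameters) processes every group.
        out.extend(shared_fn(list(features[g * size:(g + 1) * size])))
    return out


def dense_weights(width: int, num_groups: int = 1) -> int:
    """Weight count of a square dense layer that is shared across groups."""
    return (width // num_groups) ** 2
```

In this toy setting, `dense_weights(128)` is 16384 while `dense_weights(128, 4)` is 1024, which shows why sharing one small module across groups shrinks a model. The reductions reported in the abstract come from the authors' full architecture and are not reproduced by this sketch.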

Keywords