IEEE Access (Jan 2023)

Speaker Verification Based on Single Channel Speech Separation

  • Rong Jin,
  • Mijit Ablimit,
  • Askar Hamdulla

DOI
https://doi.org/10.1109/ACCESS.2023.3287868
Journal volume & issue
Vol. 11
pp. 112631 – 112638

Abstract


In multi-speaker scenarios, speech processing tasks such as speaker identification and speech recognition are susceptible to noise and overlapped voices. Because overlapped voices form a complicated mixture of signals, extracting the target speaker from this mixture is a good front-end solution for further processing such as understanding and classification. The quality of speech separation can be assessed by noise ratios or subjective scoring, and it can also be assessed by the accuracy of downstream tasks such as speaker identification. To make the separation model and the speaker identification model better adapted to complex multi-speaker overlapping scenarios, this research investigates a speech separation model and combines it with a voiceprint recognition task. This paper proposes a feature-scale single-channel speech separation network connected to a back-end speaker verification network with MFCCT features, so that the accuracy of speaker identification indicates the quality of the speech separation task. The datasets are prepared by synthesizing VoxCeleb1 data and are used for training and testing. The results show that using an objective downstream evaluation can effectively improve overall performance, as the optimized speech separation model significantly reduces the speaker verification error rate.
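To illustrate the pipeline the abstract describes, the following is a minimal sketch of a separation-then-verification trial. The `separator` and `verifier` models are hypothetical stand-ins for the paper's single-channel separation network and MFCCT-based speaker verification network (their architectures are not given here); only the MFCC extraction uses a real API (torchaudio), and the paper's MFCCT features would extend the plain MFCCs shown.

```python
# Sketch: separate a two-speaker mixture, embed each estimated source,
# and accept the verification trial if any separated stream matches the
# enrolled speaker. `separator` and `verifier` are hypothetical models.
import torch
import torchaudio

mfcc_extractor = torchaudio.transforms.MFCC(
    sample_rate=16000,
    n_mfcc=20,
    melkwargs={"n_fft": 400, "hop_length": 160, "n_mels": 40},
)

def verify_from_mixture(mixture, enrolled_embedding, separator, verifier,
                        threshold=0.5):
    with torch.no_grad():
        estimates = separator(mixture)           # (n_sources, samples), hypothetical
        scores = []
        for est in estimates:
            feats = mfcc_extractor(est)          # (n_mfcc, frames); MFCCT would add
                                                 # further time-domain terms
            emb = verifier(feats.unsqueeze(0))   # speaker embedding, hypothetical
            score = torch.nn.functional.cosine_similarity(
                emb, enrolled_embedding, dim=-1
            )
            scores.append(score.item())
    return max(scores) >= threshold, scores
```

Sweeping `threshold` over many such trials would yield the verification error rate used as the downstream, objective measure of separation quality.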

Keywords