Iranian Journal of Electrical and Electronic Engineering (Jun 2019)

A Novel Singing Voice Separation Method Based on Sparse Non-Negative Matrix Factorization and Low-Rank Modeling

  • S. Mavaddati

Journal volume & issue
Vol. 15, no. 2
pp. 161 – 171

Abstract

A new single-channel singing voice separation algorithm is presented in this paper. Singing voice separation supports applications such as singer identification, voice recognition, and data retrieval. The separation is performed using a decomposition model based on the spectrogram of singing voice signals. The novelty of the proposed algorithm lies in the following: 1) The decomposition scheme employs vocal and music models learned using a sparse non-negative matrix factorization algorithm. The vocal signal and the music accompaniment can be considered as the sparse and low-rank components of a singing voice segment, respectively. 2) An alternating factorization algorithm is used to decompose the input data based on the modeled structures of the vocal and musical components. 3) A voice activity detection algorithm, based on the energy of the coding coefficient matrix, is introduced in the training step to learn the basis vectors related to the instrumental parts. 4) In the separation phase, these non-vocal atoms are adapted to the new test conditions using a domain transfer approach, yielding a proper separation with low reconstruction error. The proposed algorithm is evaluated using several measures and yields significantly better results than earlier methods in this context and traditional procedures. The average improvements of the proposed algorithm in the PESQ, fwSegSNR, SDI, and GNSDR measures over previous separation methods, across the two defined test scenarios and the three SMR levels considered, are 0.53, 0.84, 0.39, and 2.19, respectively.
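
As a rough illustration of the sparse non-negative matrix factorization step named in the abstract, the following is a minimal sketch and not the author's implementation. It assumes a Euclidean reconstruction cost with an L1 penalty on the coding coefficients and standard multiplicative updates; the function name `sparse_nmf`, the `sparsity` weight, and the column-normalization heuristic are all assumptions made for this example.

```python
import numpy as np


def sparse_nmf(V, rank, sparsity=0.1, n_iter=200, seed=0, eps=1e-10):
    """Factor a non-negative magnitude spectrogram V (freq x frames) as W @ H.

    Hypothetical sketch: approximately minimizes
        0.5 * ||V - W H||_F^2 + sparsity * sum(H)
    with multiplicative updates. W holds the basis vectors (atoms) and
    H holds the coding coefficients.
    """
    rng = np.random.default_rng(seed)
    n_freq, n_frames = V.shape
    W = rng.random((n_freq, rank)) + eps
    H = rng.random((rank, n_frames)) + eps

    for _ in range(n_iter):
        # Coefficient update; the L1 penalty appears as a constant
        # added to the denominator, which shrinks H toward zero.
        H *= (W.T @ V) / (W.T @ W @ H + sparsity + eps)
        # Dictionary update for the Euclidean cost.
        W *= (V @ H.T) / (W @ (H @ H.T) + eps)
        # Heuristic: normalize each atom to unit L2 norm and rescale H
        # so that the product W @ H is unchanged.
        norms = np.maximum(np.linalg.norm(W, axis=0), eps)
        W /= norms
        H *= norms[:, None]

    return W, H
```

In a setting like the one described, the per-frame energy of `H` (for example `(H ** 2).sum(axis=0)`) is the kind of quantity an energy-based voice activity detector could threshold, although the exact detection rule used in the paper is not stated in the abstract.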

Keywords