IEEE Access (Jan 2023)

Blind Source Separation Based on Improved Wave-U-Net Network

  • Chaofeng Lan,
  • Jingjuan Jiang,
  • Lei Zhang,
  • Zhen Zeng

DOI: https://doi.org/10.1109/ACCESS.2023.3330160
Journal volume & issue: Vol. 11, pp. 125951 – 125958

Abstract


With the development and widespread application of voice interaction technology, improving the accuracy of blind source separation has become crucial. To further improve the separation of vocals and accompaniment, this paper proposes an improved Wave-U-Net model. We propose a segmented attention module (SAM), consisting of a spatial attention module (SPAM) and a channel attention module (CAM), to replace the skip connections of the Wave-U-Net model and thereby bridge the semantic gap caused by direct feature concatenation. Furthermore, we replace the 1D convolution layer in the bottleneck of the model with an atrous spatial pyramid pooling (ASPP) module, which enlarges the receptive field while capturing multi-scale features, improving the separation performance of the model. We conduct experiments on the MUSDB18 dataset and analyze the performance of the model using the SDR, SIR, and SAR evaluation metrics. The results show that, compared with the Wave-U-Net network that uses only feature concatenation, the SDR values of the restored vocals and restored accompaniment increase by 4.229 dB and 4.626 dB, respectively, and the separation performance exceeds that of several existing baseline models.
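The paper itself gives no code, but the two mechanisms named in the abstract can be illustrated compactly. The sketch below shows, under stated assumptions, (a) an ASPP-style bank of parallel dilated 1D convolutions concatenated across channels, as would sit at the bottleneck, and (b) a squeeze-and-excitation-style channel attention of the kind used in a CAM. All function names, tensor shapes, and the dilation rates (1, 2, 4, 8) are illustrative assumptions, not details from the paper.

```python
import numpy as np

def dilated_conv1d(x, w, rate):
    """Naive 'same'-padded dilated 1D convolution (illustrative, not optimized).
    x: (in_ch, T) feature map; w: (out_ch, in_ch, k) filter bank with k odd."""
    out_ch, in_ch, k = w.shape
    span = (k - 1) * rate + 1                 # effective receptive field of the kernel
    pad = span // 2
    xp = np.pad(x, ((0, 0), (pad, pad)))      # zero-pad so output length equals T
    out = np.zeros((out_ch, x.shape[1]))
    for t in range(x.shape[1]):
        taps = xp[:, t : t + span : rate]     # (in_ch, k) samples spaced by `rate`
        out[:, t] = np.tensordot(w, taps, axes=([1, 2], [0, 1]))
    return out

def aspp_1d(x, branch_weights, rates=(1, 2, 4, 8)):
    """ASPP-style bank: parallel dilated convolutions at several rates,
    concatenated along the channel axis to mix multi-scale context.
    The rates here are an assumed example, not the paper's configuration."""
    return np.concatenate(
        [dilated_conv1d(x, w, r) for w, r in zip(branch_weights, rates)], axis=0
    )

def channel_attention(x, w1, w2):
    """Squeeze-and-excitation-style channel attention on a (channels, T) map:
    pool over time, pass through a small bottleneck, and gate each channel."""
    s = x.mean(axis=1)                        # squeeze: global average over time
    z = np.maximum(w1 @ s, 0.0)               # excitation bottleneck with ReLU
    gates = 1.0 / (1.0 + np.exp(-(w2 @ z)))   # sigmoid gates in (0, 1)
    return x * gates[:, None]                 # reweight each channel of the map
```

Because larger dilation rates sample the same kernel over a wider time span, the concatenated branches see context at several scales at once, which is the stated purpose of replacing the bottleneck's plain 1D convolution.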

Keywords