Speech enhancement by LSTM-based noise suppression followed by CNN-based speech restoration

Maximilian Strake; Bruno Defraene; Kristoff Fluyt; Wouter Tirry; Tim Fingscheidt

doi:10.1186/s13634-020-00707-1

EURASIP Journal on Advances in Signal Processing (Dec 2020)

Affiliations

Maximilian Strake: Institute for Communications Technology, Technische Universität Braunschweig
Bruno Defraene: Goodix Technology Belgium BV
Kristoff Fluyt: Goodix Technology Belgium BV
Wouter Tirry: Goodix Technology Belgium BV
Tim Fingscheidt: Institute for Communications Technology, Technische Universität Braunschweig

DOI: https://doi.org/10.1186/s13634-020-00707-1
Journal volume & issue: Vol. 2020, no. 1, pp. 1–26

Abstract

Single-channel speech enhancement in highly non-stationary noise conditions is a very challenging task, especially when the noise includes interfering speech. Deep learning-based approaches have notably improved the performance of speech enhancement algorithms under such conditions, but they still introduce speech distortions when strong noise suppression is required. We propose to address this problem with a two-stage approach that first performs noise suppression and subsequently restores natural-sounding speech, using network topologies and loss functions chosen specifically for each task. A mask-based long short-term memory (LSTM) network is employed for noise suppression, and speech restoration is performed via spectral mapping with a convolutional encoder-decoder (CED) network. The proposed method improves speech quality (PESQ) over state-of-the-art single-stage methods by about 0.1 points for unseen highly non-stationary noise types including interfering speech. Furthermore, it is able to increase intelligibility in low-SNR conditions and consistently outperforms all reference methods.
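
The data flow of the two-stage approach described in the abstract can be sketched roughly as follows. This is only an illustrative toy, not the authors' implementation: the LSTM is replaced by a single affine layer with a sigmoid, the CED by one convolution along the frequency axis, and all names (`suppress_noise`, `restore_speech`), shapes, and random weights are assumptions made for this example. Whether the second stage sees only the first-stage output, as assumed here, is not stated in the abstract.

```python
import numpy as np

rng = np.random.default_rng(0)

def suppress_noise(noisy_mag, w_mask, b_mask):
    """Stage 1 stand-in: estimate a mask in (0, 1) per time-frequency bin and apply it.
    The paper uses an LSTM here; a single affine layer is a placeholder (assumption)."""
    logits = noisy_mag @ w_mask + b_mask           # shape: (frames, bins)
    mask = 1.0 / (1.0 + np.exp(-logits))           # sigmoid keeps the mask in (0, 1)
    return mask * noisy_mag, mask

def restore_speech(suppressed_mag, kernel):
    """Stage 2 stand-in: spectral mapping from the suppressed spectrum to a restored one.
    The paper uses a convolutional encoder-decoder; one 1-D convolution along
    frequency followed by a ReLU is a placeholder (assumption)."""
    restored = np.stack(
        [np.convolve(frame, kernel, mode="same") for frame in suppressed_mag]
    )
    return np.maximum(restored, 0.0)               # magnitudes stay non-negative

# Hypothetical toy input: 8 frames of a 129-bin magnitude spectrum.
frames, bins = 8, 129
noisy_mag = np.abs(rng.standard_normal((frames, bins)))
w_mask = 0.01 * rng.standard_normal((bins, bins))  # untrained random weights
b_mask = np.zeros(bins)
kernel = np.array([0.25, 0.5, 0.25])               # fixed smoothing kernel

suppressed_mag, mask = suppress_noise(noisy_mag, w_mask, b_mask)
restored_mag = restore_speech(suppressed_mag, kernel)
```

In this sketch the first stage can only attenuate each bin, since it multiplies by a mask below one, while the second stage maps spectrum to spectrum and is therefore free to reintroduce energy. That difference is the motivation the abstract gives for separating suppression from restoration.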

Published in EURASIP Journal on Advances in Signal Processing

ISSN: 1687-6172 (Print); 1687-6180 (Online)
Publisher: SpringerOpen
Country of publisher: United Kingdom
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering: Telecommunication; Technology: Electrical engineering. Electronics. Nuclear engineering: Electronics
Website: https://asp-eurasipjournals.springeropen.com
