IEEE Access (Jan 2020)

Speech Segregation in Background Noise Based on Deep Learning

  • Joseph Bamidele Awotunde
  • Roseline Oluwaseun Ogundokun
  • Femi Emmanuel Ayo
  • Opeyemi Emmanuel Matiluko

DOI
https://doi.org/10.1109/ACCESS.2020.3024077
Journal volume & issue
Vol. 8
pp. 169568–169575

Abstract

Speech is the most important way most people communicate. Beyond its linguistic content, speech conveys additional information such as the speaker's identity, emotion, and attitude, which makes it the most convenient and natural means of communication. Speech segregation, or speech processing in this sense, involves separating the desired speech from background noise. Recently, speech segregation has been formulated as a supervised learning problem, and the latest trend in speech processing is the use of deep learning systems to increase the computational speed and performance of speech processing tasks. Hence, this study employed a convolutional neural network to segregate speech from background noise. The convolutional neural network was used to characterize the speaker's vocal features and temporal dynamics. Unadapted (speaker-independent) models were first used to separate the two speech signals, and the separated signals were then used to estimate the input signal-to-noise ratio (SNR). The estimated SNR was in turn used to adapt the speaker models and re-estimate the speech signals, a procedure that iterated twice before convergence. The developed method was tested on the TIMIT dataset. The results showed the strength of the developed method for speech segregation in background noise and suggested that it enhances separation performance and converges reasonably fast. The system is simple and, under some input SNR conditions, performs better than state-of-the-art speech processing methods.
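The abstract outlines an iterative, SNR-guided adaptation loop: separate with unadapted models, estimate the SNR, adapt the speaker models, and re-estimate the signals. The Python/NumPy sketch below only illustrates that control flow under stated assumptions; the separation, SNR-estimation, and adaptation functions are hypothetical placeholders, not the authors' CNN-based implementation.

import numpy as np

def separate(mixture, model_a, model_b):
    # Placeholder separation step: split the mixture with per-speaker gain
    # masks derived from the (toy) speaker models.
    gain_a = model_a / (model_a + model_b + 1e-8)
    est_a = gain_a * mixture
    est_b = mixture - est_a
    return est_a, est_b

def estimate_snr(target, interference):
    # Estimate the SNR (in dB) between the two current speech estimates.
    p_t = np.mean(target ** 2) + 1e-12
    p_i = np.mean(interference ** 2) + 1e-12
    return 10.0 * np.log10(p_t / p_i)

def adapt_model(model, snr_db):
    # Placeholder adaptation: rescale the speaker model toward the level
    # implied by the estimated SNR (stand-in for the paper's adaptation step).
    return model * (10.0 ** (snr_db / 20.0))

# Toy mixture of two sinusoidal "speakers" plus background noise.
t = np.linspace(0.0, 1.0, 8000)
speaker_a = np.sin(2 * np.pi * 220 * t)
speaker_b = 0.5 * np.sin(2 * np.pi * 440 * t)
mixture = speaker_a + speaker_b + 0.05 * np.random.randn(t.size)

# Start from unadapted (speaker-independent) models; here just flat gains.
model_a = np.ones_like(mixture)
model_b = np.ones_like(mixture)

# The abstract reports that the procedure iterated twice before convergence.
for _ in range(2):
    est_a, est_b = separate(mixture, model_a, model_b)  # separate with current models
    snr_db = estimate_snr(est_a, est_b)                 # estimate SNR from the estimates
    model_a = adapt_model(model_a, snr_db)              # adapt each speaker model
    model_b = adapt_model(model_b, -snr_db)

print(f"Estimated SNR after adaptation: {snr_db:.1f} dB")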

Keywords