The Performance of Wearable Speech Enhancement System Under Noisy Environment: An Experimental Study

Pavani Cherukuru; Mumtaz Begum Mustafa; Hema Subramaniam

doi:10.1109/ACCESS.2021.3137878

IEEE Access (Jan 2022)

Affiliations

Pavani Cherukuru: Department of Software Engineering, Faculty of Computer Science and Information Technology, University of Malaya, Kuala Lumpur, Malaysia
Mumtaz Begum Mustafa: Department of Software Engineering, Faculty of Computer Science and Information Technology, University of Malaya, Kuala Lumpur, Malaysia
Hema Subramaniam: Department of Software Engineering, Faculty of Computer Science and Information Technology, University of Malaya, Kuala Lumpur, Malaysia

DOI: https://doi.org/10.1109/ACCESS.2021.3137878
Journal volume & issue: Vol. 10
pp. 5647 – 5659

Abstract

Wearable speech enhancement can improve the recognition accuracy of speech signals in stationary noise environments at signal-to-noise ratios (SNRs) of 0 dB to 60 dB. Wearable speech enhancement systems use beamforming, adaptive noise reduction, and voice activity detection algorithms to enhance speech signals. The word recognition accuracy of 63% at 0 dB SNR reported in recent works is not satisfactory for a robust speech recognition system. This paper presents an experimental study that uses fixed beamforming, adaptive noise reduction, and voice activity detection algorithms, extends the SNR range to −10 dB to 20 dB, and covers different types of noise in order to test the performance of a wearable speech enhancement system in noisy environments. It also uses deep learning-based noise reduction methods as a benchmark for speech enhancement and word recognition at different noise levels. We obtained an average word recognition accuracy of 5.74% at −10 dB and 93.79% at 20 dB in non-stationary noisy environments. The experiments show that the selected methods perform significantly better at higher SNR levels for both stationary and non-stationary noise. We found no statistically significant difference in word recognition between stationary and non-stationary noise across SNR levels. However, the deep learning-based method performs significantly better than the fixed beamforming, adaptive noise reduction, and voice activity detection algorithms at all noise levels.
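
The abstract reports results at SNRs from −10 dB to 20 dB. As a reading aid, the sketch below shows one common way to build a noisy test signal at a chosen SNR: scale the noise so that the speech-to-noise power ratio matches the target, then add it to the speech. This is an illustrative assumption, not the authors' procedure; the function names mix_at_snr and measure_snr_db are hypothetical, and the signals are plain Python lists of samples.

```python
import math


def mix_at_snr(speech, noise, snr_db):
    """Return speech plus noise, with the noise scaled to the target SNR in dB.

    Hypothetical helper for illustration; not taken from the paper.
    """
    if len(speech) != len(noise):
        raise ValueError("speech and noise must have the same length")
    p_speech = sum(s * s for s in speech) / len(speech)
    p_noise = sum(n * n for n in noise) / len(noise)
    if p_speech == 0 or p_noise == 0:
        raise ValueError("signals must have non-zero power")
    # SNR_dB = 10 * log10(P_speech / (gain^2 * P_noise)), solved for gain.
    gain = math.sqrt(p_speech / (p_noise * 10 ** (snr_db / 10)))
    return [s + gain * n for s, n in zip(speech, noise)]


def measure_snr_db(speech, noisy):
    """Measure the SNR of a mixture, given the clean speech it was built from."""
    p_speech = sum(s * s for s in speech)
    p_residual = sum((y - s) ** 2 for s, y in zip(speech, noisy))
    return 10 * math.log10(p_speech / p_residual)


# Deterministic stand-ins for real recordings: a tone and a faster tone.
speech = [math.sin(2 * math.pi * 5 * t / 1000) for t in range(1000)]
noise = [math.cos(2 * math.pi * 37 * t / 1000) for t in range(1000)]

for target in (-10, 0, 20):
    noisy = mix_at_snr(speech, noise, target)
    print(target, round(measure_snr_db(speech, noisy), 3))
```

At −10 dB the scaled noise carries ten times the power of the speech, while at 20 dB it carries one hundredth of it, which gives a sense of how wide a range the reported accuracies span.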

Published in IEEE Access

ISSN: 2169-3536 (Online)
Publisher: IEEE
Country of publisher: United States
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering
Website: https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=6287639
