Speech Enhancement Based on Two-Stage Processing with Deep Neural Network for Laser Doppler Vibrometer

Chengkai Cai; Kenta Iwai; Takanobu Nishiura

doi:10.3390/app13031958

Applied Sciences (Feb 2023)

Affiliations

Chengkai Cai: Graduate School of Information Science and Engineering, Ritsumeikan University, Kyoto 603-8577, Japan
Kenta Iwai: College of Information Science and Engineering, Ritsumeikan University, Kyoto 603-8577, Japan
Takanobu Nishiura: College of Information Science and Engineering, Ritsumeikan University, Kyoto 603-8577, Japan

DOI: https://doi.org/10.3390/app13031958
Journal volume & issue: Vol. 13, no. 3, p. 1958

Abstract

Distant-talk measurement systems have been attracting attention because they can be applied in many situations, such as security and disaster relief. One such system uses a laser Doppler vibrometer (LDV) to acquire sound by measuring the vibration that the sound source induces in an object. Unlike traditional microphones, an LDV can pick up the target sound from a distance, even in a noisy environment. However, the acquired sound is greatly distorted by the object’s shape and frequency response. Because the degradation of the observed speech is so particular, conventional methods cannot be applied effectively to LDVs. We propose two speech enhancement methods for LDVs, both based on two-stage processing with deep neural networks. In the first method, the amplitude spectrum of the observed speech is restored first, and the phase difference between the observed and clean speech is then estimated from the restored amplitude spectrum. In the second method, the low-frequency components of the observed speech are restored first, and the high-frequency components are then estimated from the restored low-frequency components. The evaluation results indicate that both methods improved the observed speech in terms of sound quality, degree of deterioration, and intelligibility.
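
The data flow of the two two-stage methods described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the function names, the single-frame FFT, the `cutoff` bin, and the sign convention for the phase difference are all hypothetical, and the deep neural networks are replaced by "oracle" stand-in callables so that the example runs without any trained model.

```python
import numpy as np

def enhance_amplitude_then_phase(observed, restore_amplitude, estimate_phase_diff):
    """First method (sketch): restore the amplitude, then estimate the phase difference."""
    spec = np.fft.rfft(observed)
    amp, phase = np.abs(spec), np.angle(spec)
    restored_amp = restore_amplitude(amp)            # stage 1: hypothetical model
    phase_diff = estimate_phase_diff(restored_amp)   # stage 2: hypothetical model
    # Assumed convention: clean phase = observed phase + estimated difference.
    enhanced = restored_amp * np.exp(1j * (phase + phase_diff))
    return np.fft.irfft(enhanced, n=len(observed))

def enhance_low_then_high(amp, cutoff_bin, restore_low, estimate_high):
    """Second method (sketch): restore the low band, then estimate the high band from it."""
    low = restore_low(amp[:cutoff_bin])   # stage 1: hypothetical model
    high = estimate_high(low)             # stage 2: hypothetical model
    return np.concatenate([low, high])

# Toy data: a random "clean" frame and an attenuated, circularly shifted "observation".
rng = np.random.default_rng(0)
clean = rng.standard_normal(256)
observed = 0.3 * np.roll(clean, 5)
clean_spec, obs_spec = np.fft.rfft(clean), np.fft.rfft(observed)
clean_amp = np.abs(clean_spec)

# Oracle stand-ins that return the ideal targets a trained network would approximate.
out1 = enhance_amplitude_then_phase(
    observed,
    restore_amplitude=lambda a: clean_amp,
    estimate_phase_diff=lambda a: np.angle(clean_spec) - np.angle(obs_spec),
)

cutoff = 32  # hypothetical low/high split, in FFT bins
out2 = enhance_low_then_high(
    np.abs(obs_spec),
    cutoff,
    restore_low=lambda low: clean_amp[:cutoff],
    estimate_high=lambda low: clean_amp[cutoff:],
)
```

In both functions the second stage receives only the output of the first stage, which mirrors the ordering stated in the abstract. In practice each callable would be a trained network applied to short-time spectra, not a single frame.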

Published in Applied Sciences

ISSN: 2076-3417 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Technology: Engineering (General). Civil engineering (General); Science: Biology (General); Science: Physics; Science: Chemistry
Website: http://www.mdpi.com/journal/applsci
