Improvement on sound quality of optical-body-conducted speech using convolutional neural network

Daiki KAWAMOTO; Masashi NAKAYAMA

doi:10.1299/transjsme.23-00304

Nihon Kikai Gakkai ronbunshu (May 2024)

Affiliations

Daiki KAWAMOTO: Graduate School of Information Sciences, Hiroshima City University
Masashi NAKAYAMA: Graduate School of Information Sciences, Hiroshima City University

DOI: https://doi.org/10.1299/transjsme.23-00304
Journal volume & issue: Vol. 90, no. 934
Article no. 23-00304

Abstract

Magnetic resonance imaging (MRI) is widely used in the medical field and enables precise examinations such as cancer detection. Unlike CT, MRI involves no radiation exposure, but the magnetic field coils generate loud acoustic noise and the long examination time places stress on the patient. In addition, the physician and patient cannot easily communicate through a conventional microphone, because the magnetic field in the MRI room rules out devices that contain magnetic materials. This study aims to enable communication in a high-magnetic-field, high-noise environment such as an MRI room, and proposes a communication method based on an optical-body-conducted speech microphone. Optical-body-conducted speech is speech conducted through the human body and picked up by a contact-type optical microphone. It is robust against airborne noise, and because optical microphones are made of non-magnetic materials, they can be used in strong magnetic fields. Like conventional body-conducted speech, however, its sound quality is low because components at 2 kHz and above are attenuated. We therefore propose a sound retrieval method for optical-body-conducted speech that combines noise suppression and sound quality improvement using a convolutional neural network (CNN), to obtain clearer and more natural speech. An objective evaluation using mel-cepstral distortion (MCD) confirmed that the CNN-WaveGlow transformation improves the sound quality of optical-body-conducted speech.
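
As an illustration of the objective measure named in the abstract, the following is a minimal sketch of one common formulation of mel-cepstral distortion. The abstract does not state which MCD variant, cepstral order, or frame alignment the authors used, so everything here is an assumption: the function name `mel_cepstral_distortion` is hypothetical, the two inputs are assumed to be already time-aligned mel-cepstral sequences of equal shape, and the 0th (energy) coefficient is excluded, as is often done. Lower values indicate that the converted speech is spectrally closer to the reference.

```python
import numpy as np


def mel_cepstral_distortion(ref_mcep, est_mcep):
    """Mean MCD in dB between two aligned mel-cepstral sequences.

    Both arguments are arrays of shape (frames, coefficients).
    Assumption: frames are already time-aligned and column 0 holds the
    energy term, which is left out of the distance.
    """
    ref = np.asarray(ref_mcep, dtype=float)
    est = np.asarray(est_mcep, dtype=float)
    if ref.ndim != 2 or ref.shape != est.shape:
        raise ValueError("inputs must be 2-D arrays of identical shape")

    # Difference over coefficients 1..D (skip the 0th coefficient).
    diff = ref[:, 1:] - est[:, 1:]

    # Per-frame distortion: (10 / ln 10) * sqrt(2 * sum_d diff_d^2).
    scale = 10.0 / np.log(10.0)
    per_frame = scale * np.sqrt(2.0 * np.sum(diff ** 2, axis=1))

    # Average over all frames to get a single score.
    return float(np.mean(per_frame))


if __name__ == "__main__":
    # Synthetic data only; these are not values from the paper.
    rng = np.random.default_rng(0)
    reference = rng.normal(size=(100, 25))
    converted = reference + 0.05 * rng.normal(size=(100, 25))
    print(mel_cepstral_distortion(reference, converted))
```

In practice the reference would be mel-cepstra from air-conducted speech and the estimate would come from the converted optical-body-conducted speech, with the two sequences aligned beforehand (for example by dynamic time warping); that pairing is an assumption about a typical setup, not a detail given in the abstract.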

Published in Nihon Kikai Gakkai ronbunshu

ISSN: 2187-9761 (Online)
Publisher: The Japan Society of Mechanical Engineers
Country of publisher: Japan
LCC subjects: Technology: Mechanical engineering and machinery; Technology: Engineering (General). Civil engineering (General): Engineering machinery, tools, and implements
Website: https://www.jsme.or.jp/publish/transact/
