Nihon Kikai Gakkai ronbunshu (May 2024)
Improvement on sound quality of optical-body-conducted speech using convolutional neural network
Abstract
Magnetic resonance imaging (MRI) is used in the medical field and enables precise examinations such as cancer detection. Although MRI, unlike CT, involves no radiation exposure, its magnetic field coils generate loud noise, and the long examination time places stress on the patient. In addition, the physician and patient cannot easily communicate through a conventional microphone, because the magnetic field in the MRI room prevents the use of devices that contain magnetic materials. This study aims to enable communication in high-magnetic-field, high-noise environments such as an MRI room, and proposes a communication method based on an optical-body-conducted speech microphone. Optical-body-conducted speech is speech conducted through the human body and captured by a contact-type optical microphone. It is robust against airborne noise, and because optical microphones are made of non-magnetic materials, they can be used in strong magnetic fields. Like conventional body-conducted speech, however, its sound quality is low because components at 2 kHz and above are attenuated. We therefore propose a sound retrieval method that applies noise suppression and sound quality improvement to optical-body-conducted speech using a convolutional neural network (CNN), in order to obtain clearer and more natural speech. Objective evaluation using mel-cepstral distortion (MCD) confirmed that the CNN-WaveGlow transformation improves the sound quality of optical-body-conducted speech.
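For readers unfamiliar with the objective measure named above, the following is a minimal sketch of how MCD (commonly expanded as mel-cepstral distortion) is often computed. It is an illustration under stated assumptions, not the evaluation code of this study: the function name `mel_cepstral_distortion` is hypothetical, the two mel-cepstrum sequences are assumed to be already time-aligned frame by frame, and the 0th (energy) coefficient is excluded, which is a common convention that the abstract does not confirm.

```python
import numpy as np


def mel_cepstral_distortion(reference, converted):
    """Mean MCD in dB between two aligned mel-cepstrum sequences.

    Both inputs have shape (frames, coefficients). Column 0 is treated as
    the energy term and skipped (an assumption, not stated in the abstract).
    """
    reference = np.asarray(reference, dtype=float)
    converted = np.asarray(converted, dtype=float)
    if reference.shape != converted.shape:
        raise ValueError("sequences must be time-aligned and equally shaped")

    # Per-frame squared difference over coefficients 1..D.
    diff = reference[:, 1:] - converted[:, 1:]
    squared_sum = np.sum(diff ** 2, axis=1)

    # Standard scaling: (10 / ln 10) * sqrt(2 * sum of squared differences).
    per_frame_db = (10.0 / np.log(10.0)) * np.sqrt(2.0 * squared_sum)

    # Average over frames to get one score per utterance.
    return float(np.mean(per_frame_db))
```

A lower value means the converted speech is spectrally closer to the reference, so an improvement in sound quality would appear as a decrease in MCD relative to the unprocessed optical-body-conducted speech.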
Keywords