Speech reconstruction using a deep partially supervised neural network

Ian McLoughlin; Jingjie Li; Yan Song; Hamid R. Sharifzadeh

doi:10.1049/htl.2016.0103

Healthcare Technology Letters (May 2017)

Affiliations

Ian McLoughlin: The University of Kent
Jingjie Li: The University of Science and Technology of China
Yan Song: The University of Science and Technology of China
Hamid R. Sharifzadeh: Unitec Institute of Technology

Abstract

Statistical speech reconstruction for larynx-related dysphonia has achieved good performance using Gaussian mixture models and, more recently, restricted Boltzmann machine arrays; however, deep neural network (DNN)-based systems have been hampered by the limited amount of training data available from individual voice-loss patients. The authors propose a novel DNN structure that allows a partially supervised training approach on spectral features from smaller data sets, yielding very good results compared with the current state-of-the-art.

Published in Healthcare Technology Letters

ISSN: 2053-3713 (Online)
Publisher: Wiley
Country of publisher: United Kingdom
LCC subjects: Medicine: Medicine (General): Medical technology
Website: https://ietresearch.onlinelibrary.wiley.com/journal/20533713
