Healthcare Technology Letters (May 2017)

Speech reconstruction using a deep partially supervised neural network

  • Ian McLoughlin,
  • Jingjie Li,
  • Yan Song,
  • Hamid R. Sharifzadeh

DOI
https://doi.org/10.1049/htl.2016.0103

Abstract

Statistical speech reconstruction for larynx-related dysphonia has achieved good performance using Gaussian mixture models and, more recently, restricted Boltzmann machine arrays; however, deep neural network (DNN)-based systems have been hampered by the limited amount of training data available from individual voice-loss patients. The authors propose a novel DNN structure that allows a partially supervised training approach on spectral features from smaller data sets, yielding very good results compared with the current state of the art.

Keywords