Voice Quality Modelling for Expressive Speech Synthesis

Carlos Monzo; Ignasi Iriondo; Joan Claudi Socoró

doi:10.1155/2014/627189

The Scientific World Journal (Jan 2014)

Affiliations

Carlos Monzo: Computer Science, Multimedia and Telecommunication Studies, Universitat Oberta de Catalunya (UOC), Rambla del Poblenou 156, 08018 Barcelona, Spain
Ignasi Iriondo: Grup de Recerca en Tecnologies Mèdia (GTM), Universitat Ramon Llull, La Salle, Quatre Camins 2, 08022 Barcelona, Spain
Joan Claudi Socoró: Grup de Recerca en Tecnologies Mèdia (GTM), Universitat Ramon Llull, La Salle, Quatre Camins 2, 08022 Barcelona, Spain

Journal volume & issue: Vol. 2014

Abstract

This paper presents perceptual experiments carried out to validate a methodology for transforming speech from a neutral style into several expressive styles by modelling voice quality (VoQ) parameters together with the well-known prosodic parameters (F0, duration, and energy). The main goal was to validate the usefulness of VoQ for enhancing expressive synthetic speech in terms of speech quality and style identification. A harmonic plus noise model (HNM) was used to modify the VoQ and prosodic parameters, which were extracted from an expressive speech corpus. The perception test results indicated that modelling VoQ along with the prosodic characteristics improved the expressive speech styles obtained.

Published in The Scientific World Journal

ISSN: 2356-6140 (Print); 1537-744X (Online)
Publisher: Wiley
Country of publisher: United Kingdom
LCC subjects: Technology; Medicine; Science
Website: https://onlinelibrary.wiley.com/journal/8086
