A survey of voice conversion based on non-parallel data

Pengcheng LI; Xulong ZHANG; Jianzong WANG; Ning CHENG; Jing XIAO

大数据 (Big Data Research) (May 2024)

Journal volume & issue: Vol. 10
pp. 65 – 81

Abstract

Voice conversion is a research topic in the fields of speech and artificial intelligence. Its goal is to change the timbre of speech while preserving the content of the source speech, so that it sounds as if it were spoken by the target speaker. It is essential to ensure both the quality and the naturalness of the converted speech. Voice conversion based on non-parallel data is currently attracting considerable attention: models are trained on non-parallel, multilingual speaker datasets, enabling many-to-many and any-to-any voice conversion. This paper provides a comprehensive summary and analysis of recent developments in non-parallel voice conversion. First, we outline the early voice conversion techniques based on parallel corpora and their limitations. Then, we introduce and compare various approaches to voice conversion based on non-parallel data and analyze them in detail. Finally, we summarize the field and give an outlook on voice conversion technology.

Keywords: voice conversion; artificial intelligence; deep learning

Published in 大数据

ISSN: 2096-0271 (Print)
Publisher: China InfoCom Media Group
Country of publisher: China
LCC subjects: Science: Mathematics: Instruments and machines: Electronic computers. Computer science
Website: http://www.infocomm-journal.com/bdr/EN/2096-0271/home.shtml