Applied Sciences (Sep 2023)

An Improved Chinese Pause Fillers Prediction Module Based on RoBERTa

  • Ling Yu,
  • Xiaoqun Zhou,
  • Fanglin Niu

DOI
https://doi.org/10.3390/app131910652
Journal volume & issue
Vol. 13, no. 19
p. 10652

Abstract


The prediction of pause fillers plays a crucial role in enhancing the naturalness of synthesized speech. In recent years, neural network models including LSTM, BERT, and XLNet have been employed in pause filler prediction modules; however, these methods have exhibited relatively low prediction accuracy. This paper introduces the use of the RoBERTa model for predicting Chinese pause fillers and presents a novel approach to training it that effectively improves prediction accuracy. The proposed approach categorizes text from different speakers into four distinct style groups based on the frequency and position of Chinese pause fillers. The RoBERTa model is trained on these four groups of data, which incorporate different filler styles, thereby ensuring a more natural synthesis of speech. The Chinese pause filler prediction module is evaluated on systems such as Parallel Tacotron2, FastPitch, and Deep Voice3, achieving a notable 26.7% improvement in word-level prediction accuracy over the BERT model, along with a 14% improvement in position-level prediction accuracy. These gains translate into a significant improvement in the naturalness of the generated speech.
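At its core, the module predicts, for each token of input text, whether (and which) pause filler should be inserted after it. Below is a minimal sketch of this token-classification framing with a pretrained Chinese RoBERTa checkpoint; the checkpoint name hfl/chinese-roberta-wwm-ext, the label set, and the helper predict_filler_positions are illustrative assumptions, not the authors' released code.

```python
# Sketch: pause filler prediction as token classification with Chinese RoBERTa.
# Assumptions (not from the paper): the checkpoint name and the label set.
import torch
from transformers import AutoTokenizer, AutoModelForTokenClassification

MODEL_NAME = "hfl/chinese-roberta-wwm-ext"  # assumed Chinese RoBERTa checkpoint

# Hypothetical labels: "O" = no filler after this token; the others mark
# which filler (e.g. "嗯", "呃") to insert after the token.
LABELS = ["O", "INSERT_EN", "INSERT_E"]

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForTokenClassification.from_pretrained(
    MODEL_NAME, num_labels=len(LABELS)
)

def predict_filler_positions(text: str):
    """Return (token, label) pairs marking predicted filler insertion points."""
    enc = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        logits = model(**enc).logits  # shape: (1, seq_len, num_labels)
    pred_ids = logits.argmax(dim=-1)[0].tolist()
    tokens = tokenizer.convert_ids_to_tokens(enc["input_ids"][0])
    return [(tok, LABELS[i]) for tok, i in zip(tokens, pred_ids)]

# Before fine-tuning, the classification head is randomly initialized, so the
# labels are meaningless; after fine-tuning on filler-annotated transcripts
# (grouped by speaker style, per the paper), non-"O" labels mark where the
# TTS front end should insert a pause filler.
print(predict_filler_positions("我今天想去公园散步"))
```

Training one such model per style group, as the abstract describes, would let the synthesis front end match the filler frequency and placement habits of a target speaking style.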

Keywords