Enhancing Local Dependencies for Transformer-Based Text-to-Speech via Hybrid Lightweight Convolution

Wei Zhao; Ting He; Li Xu

doi:10.1109/ACCESS.2021.3065736

IEEE Access (Jan 2021)

Affiliations

Wei Zhao: College of Electrical Engineering, Zhejiang University, Hangzhou, China
Ting He: College of Electrical Engineering, Zhejiang University, Hangzhou, China
Li Xu: College of Electrical Engineering, Zhejiang University, Hangzhou, China

DOI: https://doi.org/10.1109/ACCESS.2021.3065736
Journal volume & issue: Vol. 9
pp. 42762 – 42770

Abstract

Owing to its powerful self-attention mechanism, the Transformer network has achieved considerable success across many sequence modeling tasks and has become one of the most popular methods in text-to-speech (TTS). Vanilla self-attention excels at capturing long-range dependencies but struggles to model stable short-range dependencies, which are important for speech synthesis because local audio signals are highly correlated. To address this problem, we propose the hybrid lightweight convolution (HLC), which fully exploits the local structure of a sequence, and combine it with self-attention to improve Transformer-based TTS. Experimental results show that the modified model performs better in both objective and subjective evaluations. We also demonstrate that a more compact TTS model may be built by combining self-attention with the proposed hybrid lightweight convolution. The method is also potentially adaptable to other sequence modeling tasks.
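
The abstract does not specify how HLC is constructed, so the following is only a minimal NumPy sketch of the general idea it describes: a local, convolution-based path added alongside a global self-attention path. It assumes a generic lightweight convolution (a depthwise convolution whose kernel is softmax-normalized and shared across channel groups) and a plain sum of the two paths. The function names, shapes, and the summation are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def lightweight_conv(x, kernel_logits):
    """Generic lightweight convolution (assumed form, not the paper's HLC).

    x:             (T, D) sequence of T frames with D channels.
    kernel_logits: (H, K) one kernel of odd width K per group of D // H channels.
    """
    T, D = x.shape
    H, K = kernel_logits.shape
    w = softmax(kernel_logits, axis=-1)           # normalize each kernel over its width
    w_per_channel = np.repeat(w, D // H, axis=0)  # share each kernel within its group -> (D, K)
    pad = K // 2
    xp = np.pad(x, ((pad, pad), (0, 0)))          # zero-pad along time to keep length T
    out = np.zeros_like(x, dtype=float)
    for k in range(K):
        # Depthwise: each channel is mixed only with its own neighbors in time.
        out += xp[k:k + T] * w_per_channel[:, k]
    return out

def self_attention(x):
    # Single-head scaled dot-product attention without learned projections.
    scores = x @ x.T / np.sqrt(x.shape[1])
    return softmax(scores, axis=-1) @ x

def local_plus_global(x, kernel_logits):
    # Assumed combination: sum the global (attention) and local (convolution) paths.
    return self_attention(x) + lightweight_conv(x, kernel_logits)
```

With a kernel width of 3, each output frame of the convolution path depends only on a frame and its two immediate neighbors, while the attention path can draw on every frame in the sequence.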

Published in IEEE Access

ISSN: 2169-3536 (Online)
Publisher: IEEE
Country of publisher: United States
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering
Website: https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=6287639
