Acoustic Modeling Based on Deep Learning for Low-Resource Speech Recognition: An Overview

Chongchong Yu; Meng Kang; Yunbing Chen; Jiajia Wu; Xia Zhao

doi:10.1109/ACCESS.2020.3020421

IEEE Access (Jan 2020)

Affiliations

Chongchong Yu: Key Laboratory of Industrial Internet and Big Data, Beijing Technology and Business University, Beijing, China
Meng Kang: Key Laboratory of Industrial Internet and Big Data, Beijing Technology and Business University, Beijing, China
Yunbing Chen: Putian Information Technology Company, Ltd., Beijing, China
Jiajia Wu: Key Laboratory of Industrial Internet and Big Data, Beijing Technology and Business University, Beijing, China
Xia Zhao: Key Laboratory of Industrial Internet and Big Data, Beijing Technology and Business University, Beijing, China

Journal volume & issue: Vol. 8, pp. 163829–163843

Abstract

The polarization of the world's languages is becoming increasingly evident. Many languages, chiefly endangered ones, are low-resource because too little information about them is available. As a result, both language conservation and cultural heritage face significant challenges, and speech recognition in low-resource scenarios has become a hot topic in the speech field. With its complex network structures and large numbers of model parameters, deep learning has become a powerful tool for speech recognition and is of broad and far-reaching significance for the study of low-resource speech recognition. Focusing on the characteristics of low-resource settings, this article reviews the history and research status of two kinds of acoustic models: deep learning neural networks and end-to-end acoustic structures. We further elaborate on several key techniques for improving performance from two aspects, data and model training. Two projects for low-resource languages are also introduced. Finally, possible future developments are pointed out. This work provides a reference for computer speech and language processing.

Published in IEEE Access

ISSN: 2169-3536 (Online)
Publisher: IEEE
Country of publisher: United States
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering
Website: https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=6287639
