M2ASR-KIRGHIZ: A Free Kirghiz Speech Database and Accompanied Baselines

Ikram Mamtimin; Wenqiang Du; Askar Hamdulla

doi:10.3390/info14010055

Information (Jan 2023)

Affiliations

Ikram Mamtimin: School of Information Science and Engineering, Xinjiang University, Ürümqi 830017, China
Wenqiang Du: Center for Speech and Language Technologies, BNRist, Tsinghua University, Beijing 100084, China
Askar Hamdulla: School of Information Science and Engineering, Xinjiang University, Ürümqi 830017, China

Journal volume & issue: Vol. 14, no. 1, p. 55

Abstract

Deep learning, combined with large amounts of data, has significantly improved the performance of automatic speech recognition (ASR). For minority languages, however, large-scale data resources are almost nonexistent, which limits the development of ASR technologies for these languages. In this paper, we publish a free Kirghiz speech database together with associated language resources. The database comprises 128 h of speech data from 163 speakers and the corresponding transcriptions. To our knowledge, it is the largest freely available Kirghiz speech database dedicated to the ASR task to date. We also provide several baseline systems based on Kaldi and WeNet to demonstrate how these public resources can be used to facilitate Kirghiz ASR research. This publication is part of the M2ASR project, and all resources can be downloaded from the project webpage.

Published in Information

ISSN: 2078-2489 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Technology: Technology (General): Industrial engineering. Management engineering: Information technology
Website: http://www.mdpi.com/journal/information/
