Applied Sciences (Aug 2023)
Cascade Speech Translation for the Kazakh Language
Abstract
Speech translation systems have become indispensable in facilitating seamless communication across language barriers. This paper presents a cascade speech translation system tailored specifically for translating speech from the Kazakh language to Russian. The system aims to enable effective cross-lingual communication between Kazakh and Russian speakers, addressing the unique challenges posed by these languages. To develop the cascade speech translation system, we first created a dedicated speech translation dataset ST-kk-ru based on the ISSAI Corpus. The ST-kk-ru dataset comprises a large collection of Kazakh speech recordings along with their corresponding Russian translations. The automatic speech recognition (ASR) module of the system utilizes deep learning techniques to convert spoken Kazakh input into text. The machine translation (MT) module employs state-of-the-art neural machine translation methods, leveraging the parallel Kazakh-Russian translations available in the dataset to generate accurate translations. By conducting extensive experiments and evaluations, we have thoroughly assessed the performance of the cascade speech translation system on the ST-kk-ru dataset. The outcomes of our evaluation highlight the effectiveness of incorporating additional datasets for both the ASR and MT modules. This augmentation leads to a significant improvement in the performance of the cascade speech translation system, increasing the BLEU score by approximately 2 points when translating from Kazakh to Russian. These findings underscore the importance of leveraging supplementary data to enhance the capabilities of speech translation systems.
Keywords