Towards multilingual end‐to‐end speech recognition for air traffic control

Yi Lin; Bo Yang; Dongyue Guo; Peng Fan

doi:10.1049/itr2.12094

IET Intelligent Transport Systems (Sep 2021)

Affiliations

Yi Lin: College of Computer Science, Sichuan University, Chengdu, Sichuan, China
Bo Yang: College of Computer Science, Sichuan University, Chengdu, Sichuan, China
Dongyue Guo: College of Computer Science, Sichuan University, Chengdu, Sichuan, China
Peng Fan: College of Computer Science, Sichuan University, Chengdu, Sichuan, China

DOI: https://doi.org/10.1049/itr2.12094
Journal volume & issue: Vol. 15, no. 9, pp. 1203–1214

Abstract

In this work, an end‐to‐end framework is proposed to achieve multilingual automatic speech recognition (ASR) in air traffic control (ATC) systems. Considering the standard ATC procedure, a recurrent neural network (RNN) based framework is selected to mine the temporal dependencies among speech frames. To cope with the distributed feature space caused by radio transmission, a hybrid feature embedding block is designed to extract high‐level representations, in which multiple convolutional neural networks accommodate different frequency and temporal resolutions. A residual mechanism is applied to the RNN layers to improve trainability and convergence. To integrate multilingual ASR into a single model and relieve the class imbalance, a special vocabulary, the pronunciation‐oriented vocabulary, is designed to unify the pronunciation of the vocabulary in Chinese and English. The proposed model is optimized with the connectionist temporal classification loss and validated on a real‐world speech corpus (ATCSpeech). Character error rates of 4.4% and 5.9% are achieved for Chinese and English speech, respectively, outperforming other popular approaches. Most importantly, the proposed approach accomplishes the multilingual ASR task in an end‐to‐end manner with considerably high performance.
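
The abstract reports results as character error rates but does not say how they were scored. As a minimal sketch, assuming the conventional definition (character-level edit distance summed over all utterances, divided by the total number of reference characters), the metric could be computed as follows. The function names `edit_distance` and `character_error_rate` are hypothetical, and counting every character, spaces included, is an assumption rather than the authors' documented procedure.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance: minimum insertions, deletions and
    substitutions needed to turn `hyp` into `ref`."""
    prev = list(range(len(hyp) + 1))  # distances from the empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # an empty hypothesis prefix needs i insertions
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # reference char missing in hyp
                            curr[j - 1] + 1,      # extra char in hyp
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def character_error_rate(pairs):
    """Corpus-level CER over (reference, hypothesis) string pairs.

    Assumption: every character of the reference counts, including spaces.
    """
    total_edits = sum(edit_distance(ref, hyp) for ref, hyp in pairs)
    total_chars = sum(len(ref) for ref, _ in pairs)
    if total_chars == 0:
        raise ValueError("references contain no characters")
    return total_edits / total_chars
```

Because the edits are summed before dividing, long utterances weigh more than short ones. Averaging per-utterance rates is a different convention and can give a different figure.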

Published in IET Intelligent Transport Systems

ISSN: 1751-956X (Print); 1751-9578 (Online)
Publisher: Wiley
Country of publisher: United Kingdom
LCC subjects: Technology: Engineering (General). Civil engineering (General): Transportation engineering; Science: Mathematics: Instruments and machines: Electronic computers. Computer science
Website: https://ietresearch.onlinelibrary.wiley.com/journal/17519578
