IEEE Access (Jan 2021)

LMC-SMCA: A New Active Learning Method in ASR

  • Xiusong Sun,
  • Bo Wang,
  • Shaohan Liu,
  • Tingxiang Lu,
  • Xin Shan,
  • Qun Yang

DOI
https://doi.org/10.1109/ACCESS.2021.3062157
Journal volume & issue
Vol. 9
pp. 37011 – 37021

Abstract

In Automatic Speech Recognition (ASR), transcribed data take substantial effort to obtain. It is therefore worthwhile to explore how to select the more informative samples from an untranscribed data pool so as to obtain a better model at limited cost, which makes active learning in ASR a research topic. In this manuscript, we propose two new active learning methods: the Signal-Model Committee Approach (SMCA) and the LM-based Certainty Approach (LMCA). The two methods evaluate the information content of samples from different angles and can be applied together for joint sampling in some scenarios. We conducted comparative experiments on the Listen, Attend and Spell (LAS) model under different requirements. In these experiments, we compared our approach with random sampling and with another state-of-the-art committee-based approach, the heterogeneous neural networks (HNN) based approach. We evaluated our approach by character error rate (CER) on a Chinese Mandarin speech recognition task. The results show that the proposed approach is not only simple to use but also achieves the best performance among the compared methods.
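
For readers unfamiliar with the CER metric named in the abstract, the sketch below shows how character error rate is conventionally computed: the Levenshtein (edit) distance between the reference transcript and the recognizer output, divided by the number of reference characters. It is an illustrative sketch only, not the authors' evaluation code; the function names `edit_distance` and `cer` and the example strings are assumptions made here for illustration.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two strings, counted per character."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to the empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference character
                curr[j - 1] + 1,     # insertion of a hypothesis character
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def cer(ref: str, hyp: str) -> float:
    """Character error rate: edit distance normalized by reference length."""
    if not ref:
        raise ValueError("reference transcript must be non-empty")
    return edit_distance(ref, hyp) / len(ref)


if __name__ == "__main__":
    # Hypothetical example: one substituted character in a six-character reference.
    print(cer("今天天气很好", "今天天汽很好"))
```

Because the distance is normalized by the reference length, CER can exceed 1.0 when the hypothesis contains many insertions.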

Keywords