IEEE Access (Jan 2019)

A Hardware-Oriented and Memory-Efficient Method for CTC Decoding

  • Siyuan Lu,
  • Jinming Lu,
  • Jun Lin,
  • Zhongfeng Wang

DOI
https://doi.org/10.1109/ACCESS.2019.2937680
Journal volume & issue
Vol. 7
pp. 120681 – 120694

Abstract

Connectionist Temporal Classification (CTC) has achieved great success in sequence-to-sequence analysis tasks such as automatic speech recognition (ASR) and scene text recognition (STR). These applications can use the CTC objective function to train recurrent neural networks (RNNs) and to decode the outputs of the RNNs during inference. Although hardware architectures for RNNs have been studied, hardware-based CTC decoders are still needed for high-speed CTC-based inference systems. This paper, for the first time, provides a low-complexity and memory-efficient approach to building a CTC decoder based on beam search decoding. Firstly, we improve the beam search decoding algorithm to save storage space. Secondly, we compress a dictionary (reduced from 26.02 MB to 1.12 MB) and use it as the language model. We also make searching this dictionary trivial. Finally, a fixed-point CTC decoder for an English ASR task and an STR task is implemented in C++ using the proposed method. The proposed method is shown to incur little precision loss compared with its floating-point counterpart. Our experiments show that the compression ratios of the storage required by the proposed beam search decoding algorithm are 29.49 (ASR) and 17.95 (STR).
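
For readers unfamiliar with the baseline that the abstract refers to, the following is a minimal sketch of textbook CTC prefix beam search in floating point. It is not the paper's improved, fixed-point algorithm, and it uses no dictionary or language model. The function name `ctc_beam_search`, the `PrefixScore` struct, and the convention that symbol index 0 is the blank are assumptions made for illustration only.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Probability mass of one prefix, split by how the underlying paths end.
// (Hypothetical helper type, not taken from the paper.)
struct PrefixScore {
    double p_blank = 0.0;      // paths that end in the blank symbol
    double p_non_blank = 0.0;  // paths that end in the prefix's last character
    double total() const { return p_blank + p_non_blank; }
};

// probs[t][k] is the network output for symbol k at frame t.
// Assumption: k == 0 is the CTC blank and alphabet[k - 1] is the character for k >= 1.
std::string ctc_beam_search(const std::vector<std::vector<double>>& probs,
                            const std::string& alphabet,
                            std::size_t beam_width) {
    assert(beam_width > 0);
    std::map<std::string, PrefixScore> beam;
    beam[""].p_blank = 1.0;  // before any frame, only the empty prefix exists

    for (const auto& frame : probs) {
        assert(frame.size() == alphabet.size() + 1);
        std::map<std::string, PrefixScore> next;
        for (const auto& entry : beam) {
            const std::string& prefix = entry.first;
            const PrefixScore& score = entry.second;
            // Emitting a blank leaves the prefix unchanged.
            next[prefix].p_blank += score.total() * frame[0];
            for (std::size_t k = 1; k < frame.size(); ++k) {
                const char c = alphabet[k - 1];
                if (!prefix.empty() && prefix.back() == c) {
                    // Repeating the last character without a blank collapses
                    // into the same prefix.
                    next[prefix].p_non_blank += score.p_non_blank * frame[k];
                    // A real repetition is only possible after a blank.
                    next[prefix + c].p_non_blank += score.p_blank * frame[k];
                } else {
                    next[prefix + c].p_non_blank += score.total() * frame[k];
                }
            }
        }
        // Keep only the beam_width most probable prefixes.
        std::vector<std::pair<std::string, PrefixScore>> ranked(next.begin(), next.end());
        std::sort(ranked.begin(), ranked.end(),
                  [](const std::pair<std::string, PrefixScore>& a,
                     const std::pair<std::string, PrefixScore>& b) {
                      return a.second.total() > b.second.total();
                  });
        if (ranked.size() > beam_width) ranked.resize(beam_width);
        beam.clear();
        for (const auto& e : ranked) beam[e.first] = e.second;
    }

    // Return the most probable surviving prefix.
    std::string best;
    double best_p = -1.0;
    for (const auto& entry : beam) {
        if (entry.second.total() > best_p) {
            best_p = entry.second.total();
            best = entry.first;
        }
    }
    return best;
}
```

With a one-character alphabet `"a"` and two frames that each assign 0.6 to blank and 0.4 to `a`, this sketch returns `"a"`: the three paths that collapse to `a` sum to 0.64, while the all-blank path has probability 0.36. A frame-by-frame greedy decoder would return the empty string for the same input. Because the beam stores a variable-length string for every candidate prefix, its storage grows with the beam width and the sequence length, which is the kind of cost a memory-efficient decoder has to reduce.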

Keywords