Two-Step CNN Framework for Text Line Recognition in Camera-Captured Images

Yulia S. Chernyshova; Alexander V. Sheshkus; Vladimir V. Arlazarov

doi:10.1109/ACCESS.2020.2974051

IEEE Access (Jan 2020)

Affiliations

Yulia S. Chernyshova: Federal Research Center “Computer Science and Control” of RAS, Moscow, Russia
Alexander V. Sheshkus: Federal Research Center “Computer Science and Control” of RAS, Moscow, Russia
Vladimir V. Arlazarov: Institute for Information Transmission Problems (Kharkevich Institute) RAS, Moscow, Russia

Journal volume & issue: Vol. 8, pp. 32587–32600

Abstract

In this paper, we introduce an “on the device” text line recognition framework designed for mobile and embedded systems. We consider per-character segmentation as a language-independent problem and individual character recognition as a language-dependent one. Thus, the proposed solution is based on two separate artificial neural networks (ANNs) and dynamic programming, rather than on image processing methods for the segmentation step or on an end-to-end ANN. To satisfy the tight constraints on memory size imposed by embedded systems and to avoid overfitting, we employ ANNs with a small number of trainable parameters. The primary purpose of our framework is the recognition of low-quality images of identity documents with complex backgrounds and a variety of languages and fonts. We demonstrate that our solution achieves high recognition accuracy on natural datasets even when trained on purely synthetic data. We use the MIDV-500 and Census 1961 Project datasets for text line recognition. The proposed method considerably surpasses the algorithmic method implemented in Tesseract 3.05, the LSTM method (Tesseract 4.00), and the unpublished method used in the ABBYY FineReader 15 system. Our framework is also faster than the other solutions compared. We demonstrate the language independence of our segmenter in an experiment with Cyrillic, Armenian, and Chinese text lines.
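
The abstract does not spell out how dynamic programming connects the two networks, so the following is only an illustrative sketch, not the authors' algorithm. It assumes the segmentation network yields one cut probability per image column and that character widths are bounded; the function name `choose_cuts`, the parameters `min_width` and `max_width`, and the 0.5 offset are all hypothetical. Under these assumptions, dynamic programming selects the set of cut columns with the highest total score, and each resulting segment could then be passed to the character classifier.

```python
from typing import List, Sequence


def choose_cuts(cut_probs: Sequence[float], min_width: int, max_width: int) -> List[int]:
    """Select cut columns by dynamic programming (illustrative assumption).

    cut_probs[i] is a hypothetical per-column cut probability from a
    segmentation network. The first and last columns are always cuts, and
    consecutive cuts are between min_width and max_width columns apart.
    """
    n = len(cut_probs)
    neg_inf = float("-inf")
    # Subtracting 0.5 makes unlikely cuts cost something, so the optimum
    # does not simply place a cut wherever the width limits allow one.
    gain = [p - 0.5 for p in cut_probs]

    best = [neg_inf] * n  # best[i]: best total gain of a valid cut chain ending at column i
    prev = [-1] * n       # prev[i]: previous cut on that best chain
    best[0] = gain[0]

    for i in range(1, n):
        for j in range(max(0, i - max_width), i - min_width + 1):
            if best[j] == neg_inf:
                continue
            candidate = best[j] + gain[i]
            if candidate > best[i]:
                best[i] = candidate
                prev[i] = j

    if best[n - 1] == neg_inf:
        raise ValueError("no segmentation satisfies the width constraints")

    # Walk back from the last column to recover the chosen cuts.
    cuts = []
    i = n - 1
    while i != -1:
        cuts.append(i)
        i = prev[i]
    return cuts[::-1]


if __name__ == "__main__":
    probs = [0.9, 0.1, 0.1, 0.8, 0.1, 0.2, 0.1, 0.9]
    cuts = choose_cuts(probs, min_width=2, max_width=5)
    # Each adjacent pair of cuts bounds one candidate character image.
    print(list(zip(cuts[:-1], cuts[1:])))
```

Keeping the cut selection separate from the classifier mirrors the split the abstract describes: nothing in this step depends on the alphabet, so only the recognition network would need to change from one language to another.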

Published in IEEE Access

ISSN: 2169-3536 (Online)
Publisher: IEEE
Country of publisher: United States
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering
Website: https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=6287639
