Persian Optical Character Recognition Using Deep Bidirectional Long Short-Term Memory

Zohreh Khosrobeigi; Hadi Veisi; Ehsan Hoseinzade; Hanieh Shabanian

doi:10.3390/app122211760

Applied Sciences (Nov 2022)

Affiliations

Zohreh Khosrobeigi: School of Computer Science and Statistics, Trinity College Dublin, D02 YY50 Dublin, Ireland
Hadi Veisi: Faculty of New Sciences and Technologies, University of Tehran, Tehran P.O. Box 14399-56191, Iran
Ehsan Hoseinzade: School of Computing Science, Simon Fraser University, Burnaby, BC V5A 1S6, Canada
Hanieh Shabanian: Computer Science Department, School of Computing and Analytics, Northern Kentucky University, Highland Heights, KY 41076, USA

DOI: https://doi.org/10.3390/app122211760
Journal volume & issue: Vol. 12, no. 22, p. 11760

Abstract

Optical Character Recognition (OCR) is the process of converting images that contain text into editable text, and it is applied to various languages such as English, Arabic, and Persian. While these languages have similarities, their fundamental differences create unique challenges. In Persian, continuity between characters, the existence of semicircles, dots, and obliques, and the presence of left-to-right content such as English words within the text are some of the most important challenges in designing Persian OCR systems. Our proposed framework, Bina, is designed to address the issue of continuity by utilizing a Convolutional Neural Network (CNN) and a deep bidirectional Long Short-Term Memory (BLSTM) network, a type of LSTM network that has access to both past and future context. A large and diverse dataset of about 2M samples, covering both Persian and English contexts and a variety of fonts and sizes, is also generated to train and test the proposed model. Various configurations are tested to find the optimal structure of the CNN and BLSTM. The results show that Bina outperforms the state-of-the-art baseline algorithm, achieving about 96% accuracy in the Persian context and 88% accuracy in the mixed Persian and English context.
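
To illustrate what "access to both past and future context" means, the following is a minimal sketch and not the Bina implementation. It assumes a single tanh recurrent unit in place of a full LSTM cell, arbitrary fixed weights, and a made-up list of per-column features standing in for CNN outputs; the names `rnn_pass`, `bidirectional_pass`, and `columns` are hypothetical.

```python
import math


def rnn_pass(xs, w_in, w_rec):
    """Run a one-unit tanh recurrence over xs; return the state after each step."""
    h = 0.0
    states = []
    for x in xs:
        h = math.tanh(w_in * x + w_rec * h)
        states.append(h)
    return states


def bidirectional_pass(xs, w_in=1.0, w_rec=0.5):
    """Pair a left-to-right state with a right-to-left state at every position."""
    forward = rnn_pass(xs, w_in, w_rec)
    # Process the reversed sequence, then flip the states back into reading order.
    backward = rnn_pass(xs[::-1], w_in, w_rec)[::-1]
    return list(zip(forward, backward))


# Hypothetical per-column features of a text-line image (assumed values).
columns = [0.2, -0.1, 0.7, 0.0, 0.4]
outputs = bidirectional_pass(columns)

# Same sequence with only the last column changed.
changed = bidirectional_pass(columns[:-1] + [-0.9])

for position, (fwd, bwd) in enumerate(outputs):
    print(position, round(fwd, 3), round(bwd, 3))
```

Changing only the last column leaves the forward state at position 0 untouched but alters the backward state there, which is how a bidirectional layer lets the prediction for a character depend on the connected characters that follow it as well as those that precede it.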

Published in Applied Sciences

ISSN: 2076-3417 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Technology: Engineering (General). Civil engineering (General); Science: Biology (General); Science: Physics; Science: Chemistry
Website: http://www.mdpi.com/journal/applsci
