Majlesi Journal of Electrical Engineering (Jul 2025)
Towards Omni-Font Optical Character Recognition (OCR) for Persian script using the YOLO object detection model
Abstract
Optical Character Recognition (OCR) for structurally complex scripts such as Persian faces significant challenges in interpreting nuanced characters and contextual variations. This study presents a straightforward and scalable approach to developing omni-font OCR systems. A synthetic dataset incorporating words, numbers, punctuation marks, mathematical symbols, and whitespace characters is developed to evaluate YOLO's capability in detecting 70 characters across 15 formal, informal, and handwritten-style fonts. The proposed method for detecting regular space and non-breaking space characters achieved high precision, which may eliminate the need for a separate word detection stage in an OCR system. In a second investigation, we attempted to detect characters from an unseen font by training the model on a batch of other fonts. A formal font such as “B Nazanin” is almost completely detectable when the model is trained on a batch of the fourteen other fonts, without that font being included in training. For a handwritten-style font such as “MRT_Sayeh-1”, the mean detectability increases from 54%, when the model is trained on other single fonts, to 80% when a batch of the fourteen other fonts is used. Overall, this study demonstrates that object detection-based OCR models have the potential for omni-font text recognition through expanded datasets and advancements in deep learning.
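To illustrate why detecting whitespace as ordinary object classes may remove the need for a separate word detection stage, the following minimal sketch assembles a single text line from character-level detections. It is not the implementation used in this study: the `Detection` structure, the class names `space` and `nbsp`, the mapping of the non-breaking space class to U+00A0, the confidence threshold, and the purely right-to-left ordering of a single line are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    label: str         # detected class: a character, or a whitespace class name
    x_center: float    # horizontal center of the bounding box, in pixels
    confidence: float  # detector confidence score in [0, 1]


# Hypothetical class names for the two whitespace classes (assumption).
WHITESPACE = {"space": " ", "nbsp": "\u00a0"}


def detections_to_text(detections, min_confidence=0.5):
    """Assemble one text line from character detections.

    Assumes a single line written purely right to left, so the box with
    the largest x_center comes first in logical order.
    """
    kept = [d for d in detections if d.confidence >= min_confidence]
    kept.sort(key=lambda d: d.x_center, reverse=True)
    return "".join(WHITESPACE.get(d.label, d.label) for d in kept)


# Toy detections in arbitrary order, including one low-confidence box.
sample = [
    Detection("ن", 70, 0.90),
    Detection("ب", 100, 0.95),
    Detection("space", 80, 0.80),
    Detection("م", 50, 0.90),
    Detection("ه", 90, 0.90),
    Detection("ا", 60, 0.85),
    Detection("ک", 75, 0.20),  # dropped by the confidence threshold
]

text = detections_to_text(sample)
words = text.split(" ")  # word boundaries come from the detected space class
print(words)
```

Because the space is detected like any other character, splitting the assembled string on it yields the words directly. A practical system would additionally need line grouping and handling of left-to-right runs such as numbers, which this sketch omits.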
Keywords