Mathematics (Jul 2024)
End-to-End Deep Learning Framework for Arabic Handwritten Legal Amount Recognition and Digital Courtesy Conversion
Abstract
Arabic handwriting recognition and conversion are crucial for financial operations, particularly for processing handwritten amounts on cheques and financial documents. Compared to other languages, research in this area is relatively limited, especially concerning Arabic. This study introduces an innovative AI-driven method for simultaneously recognizing and converting Arabic handwritten legal amounts into numerical courtesy forms. The framework consists of four key stages. First, a new dataset of Arabic legal amounts in handwritten form (“.png” image format) is collected and labeled by natives. Second, a YOLO-based AI detector extracts individual legal amount words from the entire input sentence images. Third, a robust hybrid classification model is developed, sequentially combining ensemble Convolutional Neural Networks (CNNs) with a Vision Transformer (ViT) to improve the prediction accuracy of single Arabic words. Finally, a novel conversion algorithm transforms the predicted Arabic legal amounts into digital courtesy forms. The framework’s performance is fine-tuned and assessed using 5-fold cross-validation tests on the proposed novel dataset, achieving a word level detection accuracy of 98.6% and a recognition accuracy of 99.02% at the classification stage. The conversion process yields an overall accuracy of 90%, with an inference time of 4.5 s per sentence image. These results demonstrate promising potential for practical implementation in diverse Arabic financial systems.
Keywords