Applied Sciences (Jan 2023)

Recognition and Classification of Handwritten Urdu Numerals Using Deep Learning Techniques

  • Aamna Bhatti
  • Ameera Arif
  • Waqar Khalid
  • Baber Khan
  • Ahmad Ali
  • Shehzad Khalid
  • Atiq ur Rehman

DOI: https://doi.org/10.3390/app13031624
Journal volume & issue: Vol. 13, no. 3, p. 1624

Abstract

Urdu is a complex language because it is an amalgam of many South Asian and East Asian languages; hence, its character recognition is a large and difficult task. It is a bidirectional language: its numerals are written from left to right, while its script is written in the opposite direction, which complicates the recognition process. This paper presents the recognition and classification of a novel Urdu numeral dataset using a convolutional neural network (CNN) and its variants. We propose a custom CNN model to extract features, which are then classified using either a Softmax activation function or a support vector machine (SVM) classifier. We compare its performance with that of GoogLeNet and the residual network (ResNet). Our proposed CNN achieves an accuracy of 98.41% with the Softmax classifier and 99.0% with the SVM classifier. GoogLeNet achieves an accuracy of 95.61% and ResNet an accuracy of 96.4%. Moreover, we develop datasets of handwritten Urdu numbers and of the numbers on Pakistani currency to incorporate real-life problems. Our models achieve higher accuracies than previous optical character recognition (OCR) models in the literature.
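
The two classification heads mentioned in the abstract can be illustrated with a minimal sketch. It assumes that a CNN has already reduced an image to a feature vector, and it is not the authors' implementation: the function names, the linear form of the SVM, and the hinge-loss margin are all assumptions made for illustration. Both heads compute one linear score per class from the same features. The Softmax head turns the scores into probabilities, while a linear multi-class SVM head takes the highest score directly and would be trained with a margin-based hinge loss.

```python
import math


def linear_scores(features, weights, biases):
    """One score per class: dot(weights[c], features) + biases[c]."""
    return [
        sum(w * x for w, x in zip(row, features)) + b
        for row, b in zip(weights, biases)
    ]


def softmax(scores):
    """Convert raw class scores into probabilities that sum to 1."""
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def softmax_predict(features, weights, biases):
    """Softmax head: return (predicted class, class probabilities)."""
    probs = softmax(linear_scores(features, weights, biases))
    best = max(range(len(probs)), key=lambda c: probs[c])
    return best, probs


def svm_predict(features, weights, biases):
    """Linear multi-class SVM head (assumed form): highest score wins."""
    scores = linear_scores(features, weights, biases)
    return max(range(len(scores)), key=lambda c: scores[c])


def multiclass_hinge_loss(scores, true_class, margin=1.0):
    """Hinge loss: penalise every wrong class whose score comes within
    `margin` of the true class score (margin value is an assumption)."""
    return sum(
        max(0.0, s - scores[true_class] + margin)
        for c, s in enumerate(scores)
        if c != true_class
    )
```

Because Softmax preserves the ordering of the scores, the two heads always pick the same class when they share weights. Any difference in accuracy, such as the one reported in the abstract, would come from how each head's weights are trained.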

Keywords