International Journal of Mathematical, Engineering and Management Sciences (Oct 2022)
CNN-Based Optical Character Recognition for Isolated Printed Gujarati Characters and Handwritten Numerals
Abstract
Optical character recognition (OCR) technologies have made significant progress in the field of language recognition. Gujarati is more difficult to recognize than many other languages because of its curves, closed loops, modifiers, and joint characters, and considerable effort has been devoted to Gujarati OCR in the literature. Deep learning-based Convolutional Neural Network (CNN) models have recently been applied to OCR for different languages, but they do not yet give satisfactory performance on Gujarati characters. This paper therefore proposes new CNN models for recognizing printed Gujarati characters and handwritten numerals. Two optimally configured CNNs are presented: CNN-PGC (CNN for Printed Gujarati Character) for printed Gujarati base characters and CNN-HGC (CNN for Handwritten Gujarati Character) for handwritten numerals. The performance of the proposed work is evaluated on specific performance indicators and compared against traditional models and the latest baseline methods. Experiments were carried out on a well-segmented, newly generated dataset of Gujarati base characters and numerals, which includes 36 consonants, 13 vowels, and 10 handwritten numerals. Variations in the database, such as size, skew, noise, and blur, are also taken into account during the experiments. Even in the presence of printing irregularities, writing irregularities, and degradations, the proposed method achieves a 98.08% recognition rate for printed characters and a 95.24% recognition rate for handwritten numerals, which is better than other existing models.
Keywords