IEEE Access (Jan 2024)

Handwritten Amharic Word Recognition With Additive Attention Mechanism

  • Ruchika Malhotra,
  • Maru Tesfaye Addis

DOI
https://doi.org/10.1109/access.2024.3444897
Journal volume & issue
Vol. 12
pp. 114645 – 114657

Abstract

Amharic is the second most widely spoken Semitic language in the world, after Arabic. As a result, a large body of Amharic documents exists in hard copy, and automatic recognition technology is needed to process it. Recognizing handwritten Amharic words is difficult because of variations in individual handwriting for the same word, adjacent words written without spaces, similarities in the shapes of characters, and noise in scanned images. The Amharic alphabet has a large character set: it adopts most of its characters from the widely used Ethiopic script, formerly known as Ge’ez script, and adds characters of its own. Despite significant advances in optical character recognition (OCR) research, Amharic script recognition has received comparatively little attention. This study employs a deep learning (DL) approach that applies additive attention to recurrent neural networks (RNNs) to recognize handwritten Amharic words accurately. The recognition model consists of seven convolutional neural networks (CNNs) and two RNNs trained with a connectionist temporal classification (CTC) strategy, which enables efficient recognition through sequential feature extraction. To address the shortage of training data for deep learning, the study used augmentation techniques to enlarge the dataset. The study used an original dataset of 12,047 handwritten Amharic words and an augmented dataset of 22,00 images. On the testing dataset, the developed model achieved an average character error rate (CER) of 2.84% and an average word error rate (WER) of 9.75%. These results indicate that the attention-based approach is promising for handwritten Amharic word recognition and represent a step toward closing the gap in OCR technology for Amharic script.
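
For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how CER and WER are commonly computed for word-level recognition. It is not taken from the paper: the function names (`edit_distance`, `cer`, `wer`) are hypothetical, CER is assumed to be the total character edit distance divided by the total number of reference characters, and WER is assumed to be the fraction of word images whose prediction differs from the reference. The paper's exact averaging scheme may differ.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance: minimum insertions, deletions and
    substitutions needed to turn hyp into ref."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(references, hypotheses):
    """Character error rate (assumed definition): total character
    edits divided by total reference characters."""
    errors = sum(edit_distance(r, h) for r, h in zip(references, hypotheses))
    total_chars = sum(len(r) for r in references)
    return errors / total_chars


def wer(references, hypotheses):
    """Word error rate (assumed definition): each sample is a single
    word, so this is the fraction of words recognized incorrectly."""
    wrong = sum(r != h for r, h in zip(references, hypotheses))
    return wrong / len(references)


if __name__ == "__main__":
    # Toy data for illustration only, not from the paper's dataset.
    refs = ["ሰላም", "ኢትዮጵያ"]
    hyps = ["ሰለም", "ኢትዮጵያ"]  # one character substituted in the first word
    print(f"CER = {cer(refs, hyps):.3f}, WER = {wer(refs, hyps):.3f}")
```

Because a single wrong character makes the whole word count as an error, WER is typically well above CER, which is consistent with the gap between the 2.84% and 9.75% figures reported above.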

Keywords