ELCVIA Electronic Letters on Computer Vision and Image Analysis (Jun 2024)

Off-line identifying Script Writers by Swin Transformers and ResNeSt-50

  • Afef Kacem Echi,
  • Takwa Ben Aïcha Gader

DOI
https://doi.org/10.5565/rev/elcvia.1787
Journal volume & issue
Vol. 23, no. 1

Abstract

Read online

In this work, we present two advanced models for identifying script writers, leveraging the power of deep learning. The proposed systems utilize the new vision Swin Transformer and ResNeSt-50. Swin Transformer is known for its robustness to variations and ability to model long-range dependencies, which helps capture context and make robust predictions. Through extensive training on large datasets of handwritten text samples, the Swin Transformer operates on sequences of image patches and learns to establish a robust representation of each writer’s unique style. On the other hand, ResNeSt-50 (Residual Neural Network with Squeeze-and-Excitation (SE) and Next Stage modules), with its multiple layers, helps in learning complex representations of a writer’s unique style and distinguishing between different writing styles with high precision. The SE module within ResNeSt helps the model focus on distinctive handwriting characteristics and reduce noise. The experimental results demonstrate exceptional performance, achieving an accuracy of 98.50% (at patch level) by the Swin Transformer on the CVL database, which consists of images with cursively handwritten German and English texts, and an accuracy of 96.61% (at page level) by ResNeSt-50 on the same database. This research advances writer identification by showcasing the effectiveness of the Swin Transformer and ResNeSt-50. The achieved accuracy underscores the potential of these models to process and understand complex handwriting effectively.

Keywords