PeerJ Computer Science (Sep 2022)

S-Swin Transformer: simplified Swin Transformer model for offline handwritten Chinese character recognition

  • Yongping Dan,
  • Zongnan Zhu,
  • Weishou Jin,
  • Zhuo Li

DOI
https://doi.org/10.7717/peerj-cs.1093
Journal volume & issue
Vol. 8
p. e1093

Abstract


The Transformer shows good prospects in computer vision. However, the Swin Transformer model suffers from a large number of parameters and a high computational cost. To address these problems, a simplified Swin Transformer (S-Swin Transformer) model is proposed in this article for handwritten Chinese character recognition. The model reduces the original four hierarchical stages to three. In addition, it enlarges the window used in window attention, so each window contains more patches and covers a larger receptive field. As the network deepens, the patches grow larger and the range perceived by each patch increases. Meanwhile, shifted window attention is applied to enhance information interaction between neighboring windows. Experimental results show that validation accuracy improves slightly as the window grows larger. The best validation accuracy of the simplified Swin Transformer model on the dataset reached 95.70%. With only 8.69 million parameters and 2.90G FLOPs, the model greatly reduces the parameter count and computation, demonstrating the correctness and validity of the proposed design.
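As a rough sketch of the two window mechanisms the abstract refers to — partitioning a feature map into non-overlapping attention windows, and the cyclic shift that lets shifted-window attention mix information across window boundaries — the following minimal NumPy illustration may help. The feature-map and window sizes here are illustrative assumptions, not the values used in the paper.

```python
import numpy as np

def window_partition(x, window_size):
    """Split an (H, W, C) feature map into non-overlapping
    (window_size, window_size, C) windows.
    H and W must be divisible by window_size."""
    H, W, C = x.shape
    x = x.reshape(H // window_size, window_size,
                  W // window_size, window_size, C)
    # Reorder so each window's rows and columns are contiguous,
    # then flatten to (num_windows, window_size, window_size, C).
    return x.transpose(0, 2, 1, 3, 4).reshape(-1, window_size, window_size, C)

def cyclic_shift(x, shift):
    """Cyclically shift the feature map before re-partitioning,
    so the next round of window attention spans the boundaries
    of the previous windows (the 'shifted window' idea)."""
    return np.roll(x, shift=(-shift, -shift), axis=(0, 1))

# Hypothetical sizes: a 56x56 map with 96 channels and an
# enlarged window of 14 (the paper enlarges the window; these
# exact numbers are assumptions for the sketch).
feat = np.random.rand(56, 56, 96)
windows = window_partition(feat, 14)   # 16 windows of shape (14, 14, 96)
shifted = cyclic_shift(feat, 14 // 2)  # shift by half a window
```

Attention would then be computed independently inside each window, which is what keeps the cost linear in image size; enlarging the window, as the paper does, trades some of that saving for a wider perceptual field per window.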

Keywords