PeerJ Computer Science (Sep 2022)

S-Swin Transformer: simplified Swin Transformer model for offline handwritten Chinese character recognition

  • Yongping Dan,
  • Zongnan Zhu,
  • Weishou Jin,
  • Zhuo Li

DOI
https://doi.org/10.7717/peerj-cs.1093
Journal volume & issue
Vol. 8
p. e1093

Abstract


The Transformer shows good prospects in computer vision. However, the Swin Transformer model suffers from a large number of parameters and a high computational cost. To address these problems, a simplified Swin Transformer (S-Swin Transformer) model is proposed in this article for handwritten Chinese character recognition. The model reduces the original four hierarchical stages to three. In addition, it enlarges the window used in window attention, so each window contains more patches and covers a larger receptive field. As the network deepens, the patches grow larger and the range perceived by each patch increases. Meanwhile, shifted window attention is applied to enhance information interaction between neighboring windows. Experimental results show that validation accuracy improves slightly as the window grows larger. The best validation accuracy of the simplified Swin Transformer model on the dataset reached 95.70%. With only 8.69 million parameters and 2.90G FLOPs, the model greatly reduces the parameter count and computation, demonstrating the correctness and validity of the proposed design.
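As a rough sketch of the two window mechanisms the abstract refers to — partitioning a feature map into non-overlapping attention windows, and the cyclic shift that lets shifted-window attention mix information across window boundaries — the following minimal NumPy illustration may help. The feature-map and window sizes here are illustrative assumptions, not the values used in the paper.

```python
import numpy as np

def window_partition(x, window_size):
    """Split an (H, W, C) feature map into non-overlapping
    (window_size, window_size, C) windows.
    H and W must be divisible by window_size."""
    H, W, C = x.shape
    x = x.reshape(H // window_size, window_size,
                  W // window_size, window_size, C)
    # Reorder so each window's rows and columns are contiguous,
    # then flatten to (num_windows, window_size, window_size, C).
    return x.transpose(0, 2, 1, 3, 4).reshape(-1, window_size, window_size, C)

def cyclic_shift(x, shift):
    """Cyclically shift the feature map before re-partitioning,
    so the next round of window attention spans the boundaries
    of the previous windows (the 'shifted window' idea)."""
    return np.roll(x, shift=(-shift, -shift), axis=(0, 1))

# Hypothetical sizes: a 56x56 map with 96 channels and an
# enlarged window of 14 (the paper enlarges the window; these
# exact numbers are assumptions for the sketch).
feat = np.random.rand(56, 56, 96)
windows = window_partition(feat, 14)   # 16 windows of shape (14, 14, 96)
shifted = cyclic_shift(feat, 14 // 2)  # shift by half a window
```

Attention would then be computed independently inside each window, which is what keeps the cost linear in image size; enlarging the window, as the paper does, trades some of that saving for a wider perceptual field per window.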

Keywords