Dianxin kexue (Sep 2024)

Swin Transformer lightweight: an efficient strategy that combines weight sharing, distillation and pruning

  • HAN Bo,
  • ZHOU Shun,
  • FAN Jianhua,
  • WEI Xianglin,
  • HU Yongyang,
  • ZHU Yanping

Journal volume & issue
Vol. 40, pp. 66–74

Abstract

Swin Transformer, a hierarchical vision transformer with shifted windows, has attracted extensive attention in computer vision due to its exceptional modeling capability. However, its high computational complexity limits its applicability on devices with constrained computational resources. To address this issue, a pruning-based compression method integrating weight sharing and distillation was proposed. First, weights were shared across layers, and transformation layers were added to transform the shared weights and thereby restore diversity among layers. Next, a parameter dependency mapping graph for the transformation blocks was constructed and analyzed, and a grouping matrix F was built to record the dependency relationships among all parameters and to identify parameters that must be pruned simultaneously. Finally, distillation was employed to recover the model's performance. Experiments on the ImageNet-Tiny-200 public dataset demonstrate that, with a 32% reduction in model computational complexity, the proposed method incurs a performance degradation as low as approximately 3%. It provides a solution for deploying high-performance artificial intelligence models in environments with limited computational resources.
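The abstract describes three ingredients: cross-layer weight sharing with small transformation layers, dependency-aware structured pruning, and distillation to recover accuracy. The sketch below illustrates only the first and last of these in PyTorch; it is not the authors' implementation, and the names (TransformAdapter, SharedStage, distill_loss), dimensions, and the choice of a generic Transformer encoder layer in place of a shifted-window Swin block are assumptions for illustration.

```python
# Minimal sketch, assuming a generic Transformer block stands in for a Swin block.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TransformAdapter(nn.Module):
    """Lightweight per-position transformation applied after the shared block,
    intended to restore diversity among layers that reuse the same weights
    (assumption: a simple linear transform on the block output)."""
    def __init__(self, dim: int):
        super().__init__()
        self.proj = nn.Linear(dim, dim)

    def forward(self, x):
        return self.proj(x)

class SharedStage(nn.Module):
    """One stage in which a single block is shared across `depth` positions,
    each position adding its own small adapter."""
    def __init__(self, dim: int, depth: int, num_heads: int = 4):
        super().__init__()
        self.shared_block = nn.TransformerEncoderLayer(
            d_model=dim, nhead=num_heads, dim_feedforward=4 * dim,
            batch_first=True)
        self.adapters = nn.ModuleList(TransformAdapter(dim) for _ in range(depth))

    def forward(self, x):
        for adapter in self.adapters:
            # Same shared weights at every position, different transformation.
            x = adapter(self.shared_block(x))
        return x

def distill_loss(student_logits, teacher_logits, labels,
                 T: float = 2.0, alpha: float = 0.5):
    """Standard soft-label distillation loss used to recover accuracy after pruning."""
    soft = F.kl_div(F.log_softmax(student_logits / T, dim=-1),
                    F.softmax(teacher_logits / T, dim=-1),
                    reduction="batchmean") * (T * T)
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard

# Usage sketch: one window of 49 tokens with embedding dimension 96 (assumed values).
tokens = torch.randn(2, 49, 96)
stage = SharedStage(dim=96, depth=4)
out = stage(tokens)
```

Sharing one block across a stage removes most of that stage's parameters, while the adapters keep per-layer behavior from collapsing; the grouped, dependency-aware pruning step described in the abstract would then remove whole channels consistently across every use of the shared weights before distillation fine-tunes the result.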

Keywords