IEEE Access (Jan 2024)

Channel Pruning Method Based on Decoupling Feature Scale Distribution in Batch Normalization Layers

  • Zijie Qiu,
  • Peng Wei,
  • Mingwei Yao,
  • Rui Zhang,
  • Yingchun Kuang

DOI
https://doi.org/10.1109/ACCESS.2024.3382994
Journal volume & issue
Vol. 12
pp. 48865–48880

Abstract


Pruning and compression are practical approaches for deploying deep convolutional neural networks in scenarios with limited memory and computational resources. To mitigate the impact of pruning on model accuracy and to improve pruning stability (defined as a negligible drop in test accuracy immediately after pruning), this study introduces a reward-penalty decoupling algorithm for automated sparse training and channel pruning. During sparse training, unimportant channels are automatically identified and their influence is reduced, thereby preserving the feature-recognition ability of the important channels. First, using the gradient information learned through network backpropagation, the feature scaling factors of the batch normalization layers are combined with their gradients to determine an importance threshold for the network channels. Subsequently, a two-stage sparse training algorithm based on the reward-penalty decoupling strategy is proposed, applying different loss-function strategies to the feature scaling factors of "important" and "unimportant" channels during decoupled sparse training. The approach is experimentally validated across various tasks, baselines, and datasets, demonstrating its superiority over previous state-of-the-art methods. The results indicate that the proposed method significantly alleviates the effect of pruning on model accuracy, and that pruned models require only limited fine-tuning to achieve excellent performance.
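
Note: the abstract's central mechanism — scoring each channel by combining its batch normalization scaling factor with its gradient, then applying different loss terms to "important" and "unimportant" channels — can be sketched in a few lines of PyTorch. The snippet below is a minimal illustration under assumptions, not the authors' implementation: the |gamma * grad| importance score follows the abstract's description, but the median-based threshold, the penalty/reward coefficients, and the helper names channel_importance and decoupled_sparsity_loss are hypothetical.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    def channel_importance(bn: nn.BatchNorm2d) -> torch.Tensor:
        # Assumed importance score: |gamma * dL/dgamma|, combining the BN
        # scaling factor with the gradient learned via backpropagation.
        assert bn.weight.grad is not None, "run loss.backward() before scoring"
        return (bn.weight * bn.weight.grad).abs().detach()

    def decoupled_sparsity_loss(bn: nn.BatchNorm2d,
                                important: torch.Tensor,
                                penalty: float = 1e-4,
                                reward: float = 1e-5) -> torch.Tensor:
        # Reward-penalty decoupling (sketch): an L1 penalty pushes the
        # scaling factors of unimportant channels toward zero, while a much
        # weaker opposite-signed term is applied to important channels so
        # their feature-recognition ability is preserved. Coefficients are
        # illustrative, not taken from the paper.
        gamma_abs = bn.weight.abs()
        return (penalty * gamma_abs[~important].sum()
                - reward * gamma_abs[important].sum())

    # Usage sketch: one hypothetical sparse-training step on a toy model.
    model = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1),
                          nn.BatchNorm2d(16), nn.ReLU(),
                          nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                          nn.Linear(16, 10))
    opt = torch.optim.SGD(model.parameters(), lr=0.01)
    bn = model[1]

    x = torch.randn(8, 3, 32, 32)
    y = torch.randint(0, 10, (8,))

    opt.zero_grad()
    task_loss = F.cross_entropy(model(x), y)
    task_loss.backward()                        # populates bn.weight.grad
    scores = channel_importance(bn)
    important = scores >= scores.median()       # assumed thresholding rule
    decoupled_sparsity_loss(bn, important).backward()  # gradients accumulate
    opt.step()

In this sketch the two losses are backpropagated separately so the channel scores can be read off the task gradients before the sparsity term is applied; whether the paper's two-stage algorithm interleaves them this way is an assumption.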

Keywords