IEEE Access (Jan 2024)
Channel Pruning Method Based on Decoupling Feature Scale Distribution in Batch Normalization Layers
Abstract
Pruning and compressing models are practical approaches for deploying deep convolutional neural networks in scenarios with limited memory and computational resources. To mitigate the impact of pruning on model accuracy and to improve pruning stability (defined as a negligible drop in test accuracy immediately after pruning), this study introduces a reward-penalty decoupling algorithm for automated sparse training and channel pruning. During sparse training, the influence of unimportant channels is automatically identified and suppressed, preserving the feature-recognition ability of the important channels. First, the feature scaling factors of the batch normalization layers are combined with the gradient information obtained through network backpropagation to determine an importance threshold for the network channels. Second, a two-stage sparse training algorithm based on the reward-penalty decoupling strategy is proposed, which applies different loss-function strategies to the feature scaling factors of “important” and “unimportant” channels during decoupled sparse training. The approach is validated experimentally across various tasks, baselines, and datasets, where it outperforms previous state-of-the-art methods. The results indicate that the proposed method significantly alleviates the effect of pruning on model accuracy, and that pruned models require only limited fine-tuning to achieve excellent performance.
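The two ingredients summarized above can be sketched in a few lines of PyTorch. This is a minimal illustration, not the authors' released implementation: scoring each channel as |γ|·|∂L/∂γ|, the 50% quantile threshold, the helper names channel_importance and reward_penalty_reg, and the coefficients lam and mu are all assumptions made for the example.

```python
import torch
import torch.nn as nn

def channel_importance(model: nn.Module) -> dict:
    """Score each BN channel as |gamma| * |dL/dgamma| (call loss.backward() first)."""
    scores = {}
    for name, m in model.named_modules():
        if isinstance(m, nn.BatchNorm2d) and m.weight.grad is not None:
            # gamma is BN's learnable scale; its gradient measures how strongly
            # the loss depends on that channel's feature scale.
            scores[name] = m.weight.detach().abs() * m.weight.grad.detach().abs()
    return scores

def reward_penalty_reg(model: nn.Module, masks: dict,
                       lam: float = 1e-4, mu: float = 1e-5) -> torch.Tensor:
    """Decoupled sparsity term (illustrative): an L1 penalty pushes 'unimportant'
    gammas toward zero, while a mild opposite-signed term spares 'important' ones."""
    reg = torch.zeros(())
    for name, m in model.named_modules():
        if isinstance(m, nn.BatchNorm2d) and name in masks:
            keep = masks[name]  # boolean mask of important channels
            reg = reg + lam * m.weight[~keep].abs().sum() - mu * m.weight[keep].abs().sum()
    return reg

# Toy usage: one backward pass, then split channels at a quantile threshold.
model = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.BatchNorm2d(16), nn.ReLU())
loss = model(torch.randn(8, 3, 32, 32)).pow(2).mean()
loss.backward()
scores = channel_importance(model)
threshold = torch.quantile(torch.cat(list(scores.values())), 0.5)  # assumed 50% split
masks = {name: s >= threshold for name, s in scores.items()}
```

In a training loop, reward_penalty_reg(model, masks) would be added to the task loss during the sparse-training stage; the exact loss strategies and their coefficients in the paper may differ from this sketch.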
Keywords