Data-Aware Adaptive Pruning Model Compression Algorithm Based on a Group Attention Mechanism and Reinforcement Learning

Zhi Yang; Yuan Zhai; Yi Xiang; Jianquan Wu; Jinliang Shi; Ying Wu

doi:10.1109/ACCESS.2022.3188119

IEEE Access (Jan 2022)

Affiliations

Zhi Yang: School of Intelligent Technology and Engineering, Chongqing University of Science and Technology, Chongqing, China
Yuan Zhai: School of Intelligent Technology and Engineering, Chongqing University of Science and Technology, Chongqing, China
Yi Xiang: School of Intelligent Technology and Engineering, Chongqing University of Science and Technology, Chongqing, China
Jianquan Wu: School of Intelligent Technology and Engineering, Chongqing University of Science and Technology, Chongqing, China
Jinliang Shi: School of Intelligent Technology and Engineering, Chongqing University of Science and Technology, Chongqing, China
Ying Wu: School of Intelligent Technology and Engineering, Chongqing University of Science and Technology, Chongqing, China

DOI: https://doi.org/10.1109/ACCESS.2022.3188119
Journal volume & issue: Vol. 10, pp. 82396–82406

Abstract

The success of convolutional neural networks (CNNs) owes much to the stacking of convolutional layers, which enlarges the model’s receptive field for image data but also slows inference. To speed up inference in large convolutional network models without sacrificing too much performance, a data-aware adaptive pruning algorithm is proposed. The algorithm consists of two parts: a channel pruning method based on the attention mechanism and a data-aware pruning policy based on reinforcement learning. Experimental results on the CIFAR-100 dataset show that pruning the VGG19, ResNet56 and EfficientNet networks with the proposed algorithm reduces performance by only 2.05%, 1.93% and 5.66%, respectively, while achieving speedup ratios of 3.63, 3.35 and 1.14, respectively, which is the best comprehensive pruning performance. In addition, the generalization ability of the reconstructed model is evaluated on the ImageNet and FGVC Aircraft datasets, where the proposed algorithm again performs best. This indicates that the algorithm learns data-related information during pruning, that is, it is a data-aware algorithm.

Published in IEEE Access

ISSN: 2169-3536 (Online)
Publisher: IEEE
Country of publisher: United States
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering
Website: https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=6287639
