Two-Stage Model Compression and Acceleration: Optimal Student Network for Better Performance

Jialiang Tang; Ning Jiang; Wenxin Yu; Jinjia Zhou; Liuwei Mai

doi:10.1109/ACCESS.2020.3040823

IEEE Access (Jan 2020)

Affiliations

Jialiang Tang: ORCiD; School of Computer Science and Technology, Southwest University of Science and Technology, Mianyang, China
Ning Jiang: ORCiD; School of Computer Science and Technology, Southwest University of Science and Technology, Mianyang, China
Wenxin Yu: ORCiD; School of Computer Science and Technology, Southwest University of Science and Technology, Mianyang, China
Jinjia Zhou: ORCiD; School of Science and Engineering, Hosei University, Tokyo, Japan
Liuwei Mai: ORCiD; Sichuan Languang Development Company Ltd., Chengdu, China

DOI: https://doi.org/10.1109/ACCESS.2020.3040823
Journal volume & issue: Vol. 8, pp. 217816–217829

Abstract

Convolutional neural networks (CNNs) have demonstrated advanced capabilities in many fields. However, the computation and parameter counts of advanced CNNs are unaffordable for existing intelligent devices, which largely hinders the practical application of CNNs. In this paper, we propose a two-stage model compression and acceleration method (abbreviated as STCA) to solve this problem. STCA is composed of a supernet and a subnet: the supernet is a large pre-trained neural network with superior performance, and the subnet is obtained by pruning the supernet. More specifically, the overall process of STCA consists of a search stage and a training stage. In the search stage, we first search for and remove the unnecessary channels of the supernet using channel importance pruning to obtain the pruned network. The weights of the pruned network are then initialized to obtain the subnet. In the training stage, the subnet learns from both the training data and the supernet: we extract knowledge from the supernet and transfer it to the subnet to improve the subnet's performance. We demonstrate the effectiveness of STCA through extensive experiments on several advanced CNNs (VGGNet, ResNet, and DenseNet). All subnets trained by STCA achieve strong performance. In particular, when VGGNet-19 is selected as the supernet, the subnet, with only about 1/10 of the parameters and 1/2 of the calculations, achieves 94.37% and 74.76% accuracy on the CIFAR-10 and CIFAR-100 datasets, respectively, which is 0.84% and 2.31% higher than the accuracy of the supernet.
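The two stages described in the abstract can be illustrated with a minimal sketch. The abstract does not specify how channel importance is scored or which loss transfers knowledge from the supernet, so both are assumptions here: importance is taken to be a precomputed per-channel score, and the training objective is the widely used temperature-softened distillation loss (cross-entropy on the label plus KL divergence to the supernet's soft outputs). The names `select_channels`, `distillation_loss`, `keep_ratio`, `temperature`, and `alpha` are hypothetical and do not come from the paper.

```python
import math


def select_channels(importance, keep_ratio):
    """Search stage (sketch): keep the channels with the highest scores.

    `importance` is an assumed, precomputed per-channel score; the paper's
    actual criterion is not given in the abstract.
    """
    n_keep = max(1, int(len(importance) * keep_ratio))
    ranked = sorted(range(len(importance)),
                    key=lambda i: importance[i], reverse=True)
    return sorted(ranked[:n_keep])  # indices of channels that survive pruning


def softmax(logits, temperature=1.0):
    """Numerically stable softmax with an optional temperature."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, label,
                      temperature=4.0, alpha=0.5):
    """Training stage (sketch): learn from the data and the supernet together.

    Assumed form: (1 - alpha) * cross-entropy on the true label
    + alpha * T^2 * KL(teacher || student) on temperature-softened outputs.
    """
    # Hard-label term: the subnet learns from the training data.
    ce = -math.log(softmax(student_logits)[label])
    # Soft-label term: the subnet learns from the supernet's outputs.
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = sum(t * math.log(t / s)
             for t, s in zip(p_teacher, p_student) if t > 0)
    return (1 - alpha) * ce + alpha * temperature ** 2 * kl


# Usage: prune half of six channels, then score one training example.
kept = select_channels([0.9, 0.05, 0.4, 0.01, 0.7, 0.3], keep_ratio=0.5)
loss = distillation_loss([2.0, 0.5, -1.0], [3.0, 0.2, -2.0], label=0)
```

In this sketch `alpha` balances the two sources of supervision: `alpha=0` trains the subnet on labels alone, while `alpha=1` trains it only to match the supernet.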

Published in IEEE Access

ISSN: 2169-3536 (Online)
Publisher: IEEE
Country of publisher: United States
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering
Website: https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=6287639
