IEEE Access (Jan 2018)
Sparsifying Deep Neural Networks Using Semi-Discrete Matrix Decomposition
Abstract
Deep learning has achieved great success in various areas, including computer vision, natural language processing, and robot control. The convolutional neural network (CNN) is the most commonly used model among deep neural networks. Despite their effectiveness at feature abstraction, CNNs require substantial computation even in the inference stage, which is a major obstacle to their deployment on embedded and mobile devices. To address this problem, we 1) propose to decompose the convolutional and fully connected layers of CNNs with the naïve semi-discrete matrix decomposition (SDD), which achieves low-rank decomposition and parameter sparsity at the same time; 2) propose a layer-merging scheme that merges two of the three resulting matrices, avoiding the explosion of intermediate data that comes with the naïve semi-discrete decomposition; and 3) propose a progressive training strategy to speed up convergence. We apply this optimized method to image classification and object detection networks. With a loss of network accuracy within 1%, we achieve significant reductions in running time and model size. The fully connected layer of the LeNet network achieves a $7\times$ speedup in the inference stage. In Faster R-CNN, the weight parameters are reduced by a factor of $5.85\times$, and inference is sped up by a factor of $1.75\times$.
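As background for the decomposition the abstract refers to, the following is a minimal NumPy sketch of the standard greedy SDD algorithm of Kolda and O'Leary, which factors a matrix as $A \approx X\,\mathrm{diag}(d)\,Y^{T}$ with $X, Y$ restricted to entries in $\{-1, 0, 1\}$. The function names, initialization heuristic, and iteration counts here are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def _best_sign_vector(s):
    # Given s = R @ y (or R.T @ x), find z in {-1, 0, 1}^len(s) maximizing
    # (z @ s)^2 / (z @ z): sort |s| descending and take the best prefix.
    order = np.argsort(-np.abs(s))
    best_val, best_len, csum = -1.0, 1, 0.0
    for j in range(1, len(s) + 1):
        csum += abs(s[order[j - 1]])
        if csum * csum / j > best_val:
            best_val, best_len = csum * csum / j, j
    z = np.zeros_like(s)
    picked = order[:best_len]
    z[picked] = np.sign(s[picked])
    return z

def sdd(A, k, inner_iters=10):
    """Greedy rank-k SDD: A ~ X @ np.diag(d) @ Y.T with X, Y in {-1, 0, 1}."""
    m, n = A.shape
    X, Y, d = np.zeros((m, k)), np.zeros((n, k)), np.zeros(k)
    R = A.astype(float).copy()        # residual that each rank-1 term peels off
    for t in range(k):
        y = np.zeros(n)
        y[t % n] = 1.0                # crude initialization; better heuristics exist
        for _ in range(inner_iters):  # alternating discrete x- and y-updates
            x = _best_sign_vector(R @ y)
            y = _best_sign_vector(R.T @ x)
        nx, ny = x @ x, y @ y
        if nx == 0 or ny == 0:        # residual exhausted, stop early
            break
        d[t] = (x @ R @ y) / (nx * ny)  # optimal scale for the fixed x, y
        R -= d[t] * np.outer(x, y)
        X[:, t], Y[:, t] = x, y
    return X, d, Y
```

A weight matrix $W$ of a fully connected layer can then be replaced by the triple $(X, d, Y)$; because $X$ and $Y$ contain only $\{-1, 0, 1\}$, multiplying by them reduces to additions and subtractions, which is the source of both the sparsity and the inference speedup that the abstract describes.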
Keywords