IEEE Access (Jan 2018)
Sparsifying Deep Neural Networks Using Semi-Discrete Matrix Decomposition
Abstract
Deep learning has achieved great success in various areas, including computer vision, natural language processing, and robot control. The convolutional neural network (CNN) is the most commonly used model among deep neural networks. Despite their effectiveness at feature abstraction, CNNs require substantial computation even in the inference stage, which is a major obstacle to their deployment on embedded and mobile devices. To address this problem, we 1) propose to decompose the convolutional and fully connected layers of CNNs with the naïve semi-discrete matrix decomposition (SDD), which achieves low-rank decomposition and parameter sparsity at the same time; 2) propose a layer-merging scheme that merges two of the three resulting matrices, avoiding the explosion of intermediate data that comes with the naïve semi-discrete decomposition; and 3) propose a progressive training strategy to speed up convergence. We apply this optimized method to image classification and object detection networks. With a loss of network accuracy within 1%, we achieve significant reductions in running time and model size. The fully connected layer of the LeNet network achieves a $7\times$ speedup in the inference stage. In Faster R-CNN, the weight parameters are reduced by a factor of $5.85\times$, and inference is sped up by a factor of $1.75\times$.
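As background for the decomposition the abstract refers to, the following is a minimal NumPy sketch of the standard greedy SDD algorithm of Kolda and O'Leary, which factors a matrix as $A \approx X\,\mathrm{diag}(d)\,Y^{T}$ with $X, Y$ restricted to entries in $\{-1, 0, 1\}$. The function names, initialization heuristic, and iteration counts here are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def _best_sign_vector(s):
    # Given s = R @ y (or R.T @ x), find z in {-1, 0, 1}^len(s) maximizing
    # (z @ s)^2 / (z @ z): sort |s| descending and take the best prefix.
    order = np.argsort(-np.abs(s))
    best_val, best_len, csum = -1.0, 1, 0.0
    for j in range(1, len(s) + 1):
        csum += abs(s[order[j - 1]])
        if csum * csum / j > best_val:
            best_val, best_len = csum * csum / j, j
    z = np.zeros_like(s)
    picked = order[:best_len]
    z[picked] = np.sign(s[picked])
    return z

def sdd(A, k, inner_iters=10):
    """Greedy rank-k SDD: A ~ X @ np.diag(d) @ Y.T with X, Y in {-1, 0, 1}."""
    m, n = A.shape
    X, Y, d = np.zeros((m, k)), np.zeros((n, k)), np.zeros(k)
    R = A.astype(float).copy()        # residual that each rank-1 term peels off
    for t in range(k):
        y = np.zeros(n)
        y[t % n] = 1.0                # crude initialization; better heuristics exist
        for _ in range(inner_iters):  # alternating discrete x- and y-updates
            x = _best_sign_vector(R @ y)
            y = _best_sign_vector(R.T @ x)
        nx, ny = x @ x, y @ y
        if nx == 0 or ny == 0:        # residual exhausted, stop early
            break
        d[t] = (x @ R @ y) / (nx * ny)  # optimal scale for the fixed x, y
        R -= d[t] * np.outer(x, y)
        X[:, t], Y[:, t] = x, y
    return X, d, Y
```

A weight matrix $W$ of a fully connected layer can then be replaced by the triple $(X, d, Y)$; because $X$ and $Y$ contain only $\{-1, 0, 1\}$, multiplying by them reduces to additions and subtractions, which is the source of both the sparsity and the inference speedup that the abstract describes.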
Keywords