Structured Sparsity of Convolutional Neural Networks via Nonconvex Sparse Group Regularization

Kevin Bui; Fredrick Park; Shuai Zhang; Yingyong Qi; Jack Xin

doi:10.3389/fams.2020.529564

Frontiers in Applied Mathematics and Statistics (Feb 2021)

Affiliations

Kevin Bui: Department of Mathematics, University of California, Irvine, Irvine, CA, United States
Fredrick Park: Department of Mathematics and Computer Science, Whittier College, Whittier, CA, United States
Shuai Zhang: Department of Mathematics, University of California, Irvine, Irvine, CA, United States
Yingyong Qi: Department of Mathematics, University of California, Irvine, Irvine, CA, United States
Jack Xin: Department of Mathematics, University of California, Irvine, Irvine, CA, United States

DOI: https://doi.org/10.3389/fams.2020.529564
Journal volume & issue: Vol. 6

Abstract

Convolutional neural networks (CNNs) have achieved superior accuracy and performance in various imaging applications, such as classification, object detection, and segmentation. However, a highly accurate CNN model requires millions of parameters to be trained and utilized. Even a slight increase in performance requires significantly more parameters, because it means adding more layers and/or increasing the number of filters per layer. Many of these weight parameters turn out to be redundant and extraneous, so the original, dense model can be replaced by a compressed version attained by imposing inter- and intra-group sparsity onto the layer weights during training. In this paper, we propose a nonconvex family of sparse group lasso that blends nonconvex regularization (e.g., transformed ℓ1, ℓ1−ℓ2, and ℓ0), which induces sparsity onto the individual weights, with ℓ2,1 regularization onto the output channels of a layer. We apply variable splitting onto the proposed regularization to develop an algorithm that consists of two steps per iteration: gradient descent and thresholding. Numerical experiments on various CNN architectures demonstrate the effectiveness of the nonconvex family of sparse group lasso in network sparsification, with test accuracy on par with the current state of the art.
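
To make the thresholding step concrete, the following is a minimal sketch in NumPy of the convex special case only: plain ℓ1 on individual weights combined with ℓ2,1 on output channels. It is not the authors' implementation. The nonconvex penalties named in the abstract (transformed ℓ1, ℓ1−ℓ2, ℓ0) have their own thresholding operators, which are not reproduced here. The function names, the weight layout `(out_channels, in_channels, kh, kw)`, and the parameters `lam_weight` and `lam_group` are assumptions chosen for illustration, and no group-size scaling is applied.

```python
import numpy as np


def soft_threshold(x, lam):
    """Elementwise soft thresholding: the proximal operator of lam * |x|."""
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)


def group_soft_threshold(w, lam):
    """Shrink each output channel (first axis) toward zero by lam in l2 norm.

    Channels whose l2 norm is at most lam are set to zero entirely,
    which is what removes whole filters.
    """
    flat = w.reshape(w.shape[0], -1)
    norms = np.linalg.norm(flat, axis=1, keepdims=True)
    # Guard against division by zero for channels that are already zero.
    scale = np.maximum(1.0 - lam / np.maximum(norms, 1e-12), 0.0)
    return (flat * scale).reshape(w.shape)


def sparse_group_threshold(w, lam_weight, lam_group):
    """Thresholding for the convex sparse group lasso (l1 + l2,1).

    Assumed stand-in for the nonconvex operators in the paper:
    elementwise soft thresholding followed by group soft thresholding.
    """
    return group_soft_threshold(soft_threshold(w, lam_weight), lam_group)


def sparse_group_penalty(w, lam_weight, lam_group):
    """Value of lam_weight * ||w||_1 + lam_group * sum_g ||w_g||_2."""
    flat = w.reshape(w.shape[0], -1)
    return (lam_weight * np.abs(flat).sum()
            + lam_group * np.linalg.norm(flat, axis=1).sum())


# Hypothetical layer with 2 output channels and 2 input channels, 1x1 kernels.
w = np.array([[3.0, -0.5], [0.2, 0.1]]).reshape(2, 2, 1, 1)
w_sparse = sparse_group_threshold(w, lam_weight=1.0, lam_group=0.5)
print(w_sparse.reshape(2, 2))
```

In this toy layer the second output channel is zeroed as a whole (inter-group sparsity), while the first channel keeps one nonzero weight (intra-group sparsity). In a variable-splitting scheme of the kind the abstract describes, a step like this would alternate with a gradient descent update on the network weights.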

Published in Frontiers in Applied Mathematics and Statistics

ISSN: 2297-4687 (Online)
Publisher: Frontiers Media S.A.
Country of publisher: Switzerland
LCC subjects: Technology: Technology (General): Industrial engineering. Management engineering: Applied mathematics. Quantitative methods; Science: Mathematics: Probabilities. Mathematical statistics
Website: http://journal.frontiersin.org/journal/applied-mathematics-and-statistics#
