IEEE Access (Jan 2019)

Prune Deep Neural Networks With the Modified $L_{1/2}$ Penalty

  • Jing Chang,
  • Jin Sha

DOI
https://doi.org/10.1109/ACCESS.2018.2886876
Journal volume & issue
Vol. 7
pp. 2273–2280

Abstract


Demands to deploy deep neural network (DNN) models on mobile devices and embedded systems have grown drastically in recent years. When transplanting DNN models to such platforms, computation and memory requirements are the main bottlenecks. To overcome them, network pruning has been studied extensively as a method of network compression. The effectiveness of network pruning is significantly degraded by the incorrect pruning of important connections. In this paper, we propose a network pruning method based on the modified $L_{1/2}$ penalty that reduces incorrect pruning by increasing the sparsity of the pretrained models. The modified $L_{1/2}$ penalty yields better sparsity than the $L_1$ penalty at a similar computational cost. Compared with past work that numerically defines the importance of connections and re-establishes important weights when incorrect pruning occurs, our method achieves faster convergence by using a simpler pruning strategy. Experimental results show that our method compresses LeNet-300-100, LeNet-5, ResNet, AlexNet, and VGG16 by factors of $66\times$, $322\times$, $26\times$, $21\times$, and $16\times$, respectively, with negligible loss of accuracy.
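
To make the idea concrete, below is a minimal PyTorch sketch of penalty-driven sparsification followed by one-shot magnitude pruning. It is not the paper's implementation: the smoothing form (a $C^1$ quadratic substitute for $|w|^{1/2}$ near zero), and the constants `eps`, `lam`, and `threshold` are illustrative assumptions, and the paper's exact modified $L_{1/2}$ penalty may differ.

```python
# Sketch: sparsify with a smoothed L_{1/2} penalty, then prune by magnitude.
# eps, lam, threshold, and the tiny model are illustrative assumptions.
import torch
import torch.nn as nn

def modified_l_half(w, eps=1e-2):
    # |w|^(1/2) away from zero; inside |w| < eps, a quadratic chosen so the
    # value and first derivative match at |w| = eps, keeping the gradient
    # bounded at w = 0 (the plain L_{1/2} gradient blows up there).
    a = w.abs()
    quad = a.pow(2) / (4 * eps ** 1.5) + 0.75 * eps ** 0.5
    # clamp_min avoids NaN gradients from sqrt(0) in the unselected branch
    return torch.where(a >= eps, a.clamp_min(eps).sqrt(), quad).sum()

model = nn.Linear(784, 10)            # stand-in for LeNet/AlexNet/VGG etc.
opt = torch.optim.SGD(model.parameters(), lr=0.01)
lam, threshold = 1e-4, 1e-3

x, y = torch.randn(32, 784), torch.randint(0, 10, (32,))
for _ in range(100):                  # sparsifying (re)training
    opt.zero_grad()
    loss = nn.functional.cross_entropy(model(x), y)
    loss = loss + lam * sum(modified_l_half(p) for p in model.parameters())
    loss.backward()
    opt.step()

with torch.no_grad():                 # one-shot magnitude pruning
    for p in model.parameters():
        p.mul_((p.abs() > threshold).float())
```

The quadratic piece keeps training stable near zero, while the $|w|^{1/2}$ tail grows more slowly than $|w|$, so large weights are penalized less aggressively than under $L_1$; this is consistent with the abstract's claim that the modified $L_{1/2}$ penalty drives sparser pretrained models at similar cost.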
