IEEE Access (Jan 2021)

Roulette: A Pruning Framework to Train a Sparse Neural Network From Scratch

  • Qiaoling Zhong
  • Zhibin Zhang
  • Qiang Qiu
  • Xueqi Cheng

DOI
https://doi.org/10.1109/ACCESS.2021.3065406
Journal volume & issue
Vol. 9
pp. 51134–51145

Abstract

Due to space and inference-time constraints, finding an efficient, sparse sub-network within a dense, over-parameterized network is critical for deploying neural networks on edge devices. Recent efforts obtain a sparse sub-network by pruning during training to reduce training costs such as memory and floating-point operations (FLOPs). However, these works require more than $1.4\times$ the total number of training iterations and manually sweep all candidate pruning hyperparameters to obtain sparse sub-networks. In this paper, we present Roulette, a pruning framework for training a sparse network from scratch. First, we propose a novel method that trains a sparse network by Pruning through the lens of the Loss Landscape iteratively and automatically (PLL). We give a theoretical analysis showing that the curvature of the loss function is higher in the initial training phase, which guides when to start network pruning. On the CIFAR-10/100 and ImageNet datasets, PLL saves up to $4\times$ the training FLOPs of prior works while maintaining comparable or better accuracy. Second, we design push and pull operations that synchronize the pruned weights across GPUs during training, allowing PLL to scale linearly to multiple GPUs. To our knowledge, Roulette is the first network pruning framework that scales linearly across multiple GPUs.
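
The paper's code is not reproduced on this page. As a rough illustration of the PLL idea, the following PyTorch sketch monitors a curvature proxy and starts global magnitude pruning once that proxy flattens. The choice of a Hutchinson estimate of the Hessian trace as the proxy, the helper names `hessian_trace` and `prune_by_magnitude`, the 0.5 drop factor, and the per-batch trigger check are all our assumptions, not the authors' published method.

```python
import torch
import torch.nn as nn

def hessian_trace(loss, params, n_samples=4):
    """Hutchinson estimate of tr(H): a cheap scalar proxy for curvature
    (our choice of proxy; the paper analyzes curvature more generally)."""
    grads = torch.autograd.grad(loss, params, create_graph=True)
    est = 0.0
    for _ in range(n_samples):
        # Rademacher probe vectors v in {-1, +1}
        vs = [torch.randint_like(p, 2) * 2 - 1 for p in params]
        hvps = torch.autograd.grad(grads, params, grad_outputs=vs,
                                   retain_graph=True)
        est += sum((h * v).sum().item() for h, v in zip(hvps, vs))
    return est / n_samples

def prune_by_magnitude(model, sparsity):
    """Zero the globally smallest |w|; return {name: binary mask}."""
    flat = torch.cat([p.detach().abs().flatten()
                      for p in model.parameters() if p.dim() > 1])
    thresh = torch.quantile(flat, sparsity)
    masks = {}
    for name, p in model.named_parameters():
        if p.dim() > 1:  # weight matrices / conv kernels only
            masks[name] = (p.detach().abs() > thresh).to(p.dtype)
            p.data.mul_(masks[name])
    return masks

def train_pll(model, loader, epochs, target_sparsity=0.9, drop=0.5):
    """Prune once the curvature proxy falls below drop * its observed peak
    (one plausible reading of PLL's trigger; the threshold is hypothetical)."""
    opt = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)
    loss_fn = nn.CrossEntropyLoss()
    peak, masks = 0.0, None
    for _ in range(epochs):
        for x, y in loader:
            loss = loss_fn(model(x), y)
            if masks is None:  # dense phase: watch curvature
                params = [p for p in model.parameters() if p.dim() > 1]
                curv = abs(hessian_trace(loss, params))
                peak = max(peak, curv)
                if curv < drop * peak:
                    masks = prune_by_magnitude(model, target_sparsity)
            opt.zero_grad()
            loss.backward()
            opt.step()
            if masks is not None:  # keep pruned weights at zero
                for name, p in model.named_parameters():
                    if name in masks:
                        p.data.mul_(masks[name])
```

In practice the curvature check would likely run only every few hundred iterations, since each Hessian-vector product costs roughly one extra backward pass.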
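The abstract names push and pull operations but does not specify them here. The sketch below shows one way such mask synchronization could be written with torch.distributed; the functions `sync_masks` and `apply_masks`, and the broadcast/MIN-intersection semantics, are our guesses rather than the paper's definitions.

```python
import torch
import torch.distributed as dist

def sync_masks(masks, mode="pull"):
    """Keep pruning decisions identical across data-parallel ranks.

    'push': rank 0 broadcasts its masks and every replica adopts them.
    'pull': replicas intersect their masks (all-reduce MIN), so a weight
    survives only if every rank kept it.
    Both semantics are assumptions about the paper's push/pull operations.
    """
    for mask in masks.values():
        if mode == "push":
            dist.broadcast(mask, src=0)
        else:
            dist.all_reduce(mask, op=dist.ReduceOp.MIN)

def apply_masks(model, masks):
    """Re-zero pruned weights after each optimizer step."""
    for name, p in model.named_parameters():
        if name in masks:
            p.data.mul_(masks[name])
```

With identical masks on every rank, the usual gradient all-reduce in DistributedDataParallel proceeds unchanged, which is consistent with the linear multi-GPU scaling the abstract claims.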

Keywords