IEEE Access (Jan 2023)

A Channel Pruning Optimization With Layer-Wise Sensitivity in a Single-Shot Manner Under Computational Constraints

  • Minsu Jeon,
  • Taewoo Kim,
  • Changha Lee,
  • Chan-Hyun Youn

DOI
https://doi.org/10.1109/ACCESS.2022.3232566
Journal volume & issue
Vol. 11
pp. 7043 – 7055

Abstract

In constrained computing environments such as mobile devices or satellite on-board systems, hardware resource limitations can restrict the processing of deep learning (DL) services. Recent DL models, such as those for satellite image analysis, often require more memory for intermediate feature maps than the hardware provides, and more computation (in FLOPs) than the accelerator can deliver within the service-level objective. As a solution, we propose a new method for controlling layer-wise channel pruning in a single-shot manner, which decides how many channels to prune in each layer by observing the dataset once, without full pretraining. To improve robustness against performance degradation, we also propose a layer-wise sensitivity measure and formulate optimization problems for deciding the layer-wise pruning ratios under target computational constraints. The optimal conditions are derived theoretically, and practical optimum-searching schemes are proposed based on these conditions. In the empirical evaluation, the proposed methods show robustness against performance degradation and demonstrate the feasibility of DL serving under constrained computing environments: they reduce memory occupation and provide acceleration and throughput improvements while preserving accuracy.
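The abstract describes the approach only at a high level. As a rough illustration, the sketch below shows one way a single-shot, FLOP-constrained, layer-wise pruning plan could be formed from per-channel saliency scores computed without pretraining. The L1-norm saliency, the per-layer normalization used as a sensitivity proxy, and the greedy allocation are assumptions made for this sketch; they are not the paper's derived optimality conditions or searching schemes.

```python
# Illustrative sketch (not the authors' exact formulation): decide which
# channels to prune per layer in a single shot, subject to a FLOP budget.
import torch
import torch.nn as nn

def channel_saliency(conv: nn.Conv2d) -> torch.Tensor:
    # L1 norm of each output channel's filter weights (a common proxy score).
    return conv.weight.detach().abs().sum(dim=(1, 2, 3))

def flops_per_channel(conv: nn.Conv2d, out_hw: int) -> float:
    # Approximate MACs contributed by one output channel of this conv layer.
    k = conv.kernel_size[0] * conv.kernel_size[1]
    return conv.in_channels * k * out_hw * out_hw

def single_shot_pruning_plan(convs, out_sizes, flop_budget):
    """Return {layer_index: set(channel indices to prune)} meeting the budget."""
    candidates = []
    total_flops = 0.0
    for li, (conv, hw) in enumerate(zip(convs, out_sizes)):
        s = channel_saliency(conv)
        s = s / (s.max() + 1e-12)   # layer-wise normalization (sensitivity proxy)
        cost = flops_per_channel(conv, hw)
        total_flops += cost * conv.out_channels
        for ci in range(conv.out_channels):
            candidates.append((s[ci].item(), li, ci, cost))

    plan = {li: set() for li in range(len(convs))}
    # Prune the least-salient channels first until the FLOP budget is met,
    # keeping at least one channel alive in every layer.
    for sal, li, ci, cost in sorted(candidates):
        if total_flops <= flop_budget:
            break
        if len(plan[li]) < convs[li].out_channels - 1:
            plan[li].add(ci)
            total_flops -= cost
    return plan
```

In this hypothetical setup, `convs` is a list of the network's convolution layers and `out_sizes` their output spatial resolutions; the resulting per-layer pruning ratios emerge from the global budget rather than being fixed uniformly, which is the general idea the paper's optimization targets.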

Keywords