Mathematics (Apr 2023)

Learning Bilateral Clipping Parametric Activation for Low-Bit Neural Networks

  • Yunlong Ding,
  • Di-Rong Chen

DOI
https://doi.org/10.3390/math11092001
Journal volume & issue
Vol. 11, no. 9
p. 2001

Abstract


Among various network compression methods, network quantization has developed rapidly due to its superior compression performance. However, trivial activation quantization schemes limit the compression performance of network quantization. Most conventional activation quantization methods directly apply rectified activation functions when quantizing models, yet their unbounded outputs generally cause drastic accuracy degradation. To tackle this problem, we propose a comprehensive activation quantization technique, the Bilateral Clipping Parametric Rectified Linear Unit (BCPReLU), as a generalized version of all rectified activation functions, which limits the quantization range more flexibly during training. Specifically, trainable slopes and thresholds are introduced for both positive and negative inputs to find more flexible quantization scales. We theoretically demonstrate that BCPReLU has approximately the same expressive power as the corresponding unbounded version and establish its convergence in low-bit quantization networks. Extensive experiments on a variety of datasets and network architectures demonstrate the effectiveness of our trainable clipping activation function.
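The abstract describes bilateral clipping with trainable slopes and thresholds on both the positive and negative sides. The sketch below is a minimal PyTorch-style illustration of that idea, not the authors' implementation: the parameter names (alpha_pos, alpha_neg, beta_pos, beta_neg), their initial values, and the exact clipping formulation are assumptions made for illustration; the paper's precise parameterization and the subsequent quantization step are not reproduced here.

```python
import torch
import torch.nn as nn


class BCPReLUSketch(nn.Module):
    """Illustrative bilateral clipping parametric activation.

    Trainable slopes (alpha_pos, alpha_neg) and clipping thresholds
    (beta_pos, beta_neg) bound the output on both sides, so the
    activation range stays finite and a quantization scale can be
    derived from the learned thresholds. Hypothetical formulation.
    """

    def __init__(self, alpha_pos=1.0, alpha_neg=0.25,
                 beta_pos=6.0, beta_neg=6.0):
        super().__init__()
        # All four parameters are learned jointly with the network weights.
        self.alpha_pos = nn.Parameter(torch.tensor(float(alpha_pos)))
        self.alpha_neg = nn.Parameter(torch.tensor(float(alpha_neg)))
        self.beta_pos = nn.Parameter(torch.tensor(float(beta_pos)))
        self.beta_neg = nn.Parameter(torch.tensor(float(beta_neg)))

    def forward(self, x):
        # Scale each side of the input with its own slope.
        pos = self.alpha_pos * torch.clamp(x, min=0.0)
        neg = self.alpha_neg * torch.clamp(x, max=0.0)
        # Clip each side at its learned threshold to bound the output range.
        pos = torch.minimum(pos, self.alpha_pos * self.beta_pos)
        neg = torch.maximum(neg, -self.alpha_neg * self.beta_neg)
        return pos + neg


if __name__ == "__main__":
    act = BCPReLUSketch()
    x = torch.linspace(-10.0, 10.0, steps=5)
    print(act(x))  # outputs are bounded on both sides
```

Because the output is bounded by the learned thresholds on both sides, a uniform quantizer applied after this activation can map the interval between the negative and positive clipping points to a fixed low-bit grid; how the thresholds and slopes are trained and folded into the quantization scale is detailed in the paper itself.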

Keywords