Gradient Estimation for Ultra Low Precision POT and Additive POT Quantization

Huruy Tesfai; Hani Saleh; Mahmoud Al-Qutayri; Baker Mohammad; Thanasios Stouraitis

doi:10.1109/ACCESS.2023.3286299

IEEE Access (Jan 2023)

Affiliations

Huruy Tesfai: ORCiD; Department of Electrical Engineering and Computer Science, System on Chip Center, Khalifa University, Abu Dhabi, United Arab Emirates
Hani Saleh: ORCiD; Department of Electrical Engineering and Computer Science, System on Chip Center, Khalifa University, Abu Dhabi, United Arab Emirates
Mahmoud Al-Qutayri: ORCiD; Department of Electrical Engineering and Computer Science, System on Chip Center, Khalifa University, Abu Dhabi, United Arab Emirates
Baker Mohammad: ORCiD; Department of Electrical Engineering and Computer Science, System on Chip Center, Khalifa University, Abu Dhabi, United Arab Emirates
Thanasios Stouraitis: ORCiD; Department of Electrical Engineering and Computer Science, System on Chip Center, Khalifa University, Abu Dhabi, United Arab Emirates

DOI: https://doi.org/10.1109/ACCESS.2023.3286299
Journal volume & issue: Vol. 11, pp. 61264–61272

Abstract

Deep learning networks achieve high accuracy for many classification tasks in computer vision and natural language processing. Because these models are usually over-parameterized, their computation and memory requirements are unsuitable for power-constrained devices. One effective way to reduce this burden is low-bit quantization. However, the quantization error it introduces lowers classification accuracy and requires design rethinking. To benefit from hardware-friendly power-of-two (POT) and additive POT quantization, we explore various gradient estimation methods and propose a quantization-error-aware gradient estimation that steers the weight update to stay as close to the projection steps as possible. The clipping or scaling coefficients of the quantization scheme are learned jointly with the model parameters to minimize quantization error. We also apply per-channel quantization to POT and additive POT quantized models to minimize the accuracy degradation caused by the rigid resolution property of POT quantization. We show that comparable accuracy can be achieved with the proposed gradient estimation for POT quantization, even at ultra-low bit precision.
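
As a rough illustration of the terms used in the abstract, the sketch below projects a weight onto a signed power-of-two grid and applies the plain straight-through estimator (STE) that gradient estimation methods of this kind are usually compared against. It is not the authors' method: the abstract does not specify the proposed quantization-error-aware estimator, and the level set (zero plus magnitudes alpha * 2^-k), the function names, and the fixed clipping value `alpha` are all assumptions made for this example. In the paper, the clipping or scaling coefficient is learned jointly with the weights rather than fixed.

```python
def pot_levels(bits, alpha=1.0):
    """Signed power-of-two grid: 0 and +/- alpha * 2^-k (assumed convention)."""
    n_mag = 2 ** (bits - 1) - 1  # nonzero magnitudes per sign
    mags = [alpha * 2.0 ** (-k) for k in range(n_mag)]
    return sorted([-m for m in mags] + [0.0] + mags)


def pot_quantize(w, bits, alpha=1.0):
    """Clip w to [-alpha, alpha], then project onto the nearest grid level."""
    w_clipped = max(-alpha, min(alpha, w))
    return min(pot_levels(bits, alpha), key=lambda q: abs(q - w_clipped))


def ste_grad(w, upstream, alpha=1.0):
    """Plain STE baseline: treat the quantizer as identity inside the clip range."""
    return upstream if abs(w) <= alpha else 0.0


if __name__ == "__main__":
    print(pot_levels(3))          # 3-bit grid with alpha = 1
    print(pot_quantize(0.3, 3))   # nearest level to 0.3
    print(ste_grad(1.5, 2.0))     # outside the clip range, gradient is blocked
```

The grid is dense near zero and sparse near `alpha`, which is the rigid resolution property the abstract refers to: adding bits refines only the region around zero. Additive POT schemes and per-channel scaling are two ways to soften that limitation.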

Published in IEEE Access

ISSN: 2169-3536 (Online)
Publisher: IEEE
Country of publisher: United States
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering
Website: https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=6287639
