IEEE Access (Jan 2024)
StarSPA: Stride-Aware Sparsity Compression for Efficient CNN Acceleration
Abstract
Sparsity in both the input features and weights of convolutional neural networks offers a valuable opportunity to significantly reduce the number of computations required during inference. Compressing the input data also reduces storage requirements and data-transfer costs, improving overall power efficiency. However, compressing randomly sparse inputs complicates the input-matching process, often incurring substantial hardware overhead and increased power consumption; these difficulties stem from the irregular structure of sparse inputs and the variability of convolutional strides. To address these challenges, this work introduces a data compression method, named Stride-Aware Sparsity Compression (StarSPA), designed to locate valid input values efficiently and expedite the multiplication process. To fully exploit the proposed compression method, a weight-stationary dataflow is employed for efficient convolution. Comprehensive simulations show that the proposed accelerator achieves speedups of $1.17\times$, $1.05\times$, $1.09\times$, $1.23\times$, and $1.12\times$ over the recent SparTen accelerator on AlexNet, VGG16, GoogLeNet, ResNet34, and EfficientNetV2, respectively. Furthermore, an FPGA implementation of the core shows a $2.55\times$ reduction in hardware size and a $5\times$ improvement in energy efficiency compared to SparTen.
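To make the compute-skipping idea concrete, the following is a minimal sketch of bitmap-style sparse compression of the kind used in sparse CNN accelerators such as SparTen (a bitmask marking nonzero positions plus a packed list of nonzero values). This is an illustrative assumption for exposition only, not the paper's StarSPA encoding, whose stride-aware details are given in the body of the paper.

```python
def compress(dense):
    """Encode a dense vector as (bitmask, packed nonzero values).

    Generic bitmap compression; NOT the StarSPA format.
    """
    bitmask = [1 if v != 0 else 0 for v in dense]
    values = [v for v in dense if v != 0]
    return bitmask, values


def decompress(bitmask, values):
    """Reconstruct the dense vector from the bitmask and packed values."""
    out, it = [], iter(values)
    for b in bitmask:
        out.append(next(it) if b else 0)
    return out


def sparse_dot(mask_a, vals_a, mask_b, vals_b):
    """Multiply-accumulate only where BOTH operands are nonzero.

    Skipping every other position is the source of the computation
    savings that sparse accelerators exploit.
    """
    ia = ib = acc = 0
    for ba, bb in zip(mask_a, mask_b):
        if ba and bb:
            acc += vals_a[ia] * vals_b[ib]
        ia += ba  # advance packed-value index only past nonzeros
        ib += bb
    return acc
```

The input-matching challenge the abstract refers to is visible in `sparse_dot`: aligning the packed value streams of two operands requires walking both bitmasks in lockstep, which is where irregular sparsity and varying convolution strides add hardware cost.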
Keywords