IEEE Access (Jan 2023)
Modeling and Library Support for Early-Stage Exploration of Sparse Tensor Accelerator Designs
Abstract
Techniques such as pruning and dimension reduction, together with the characteristics of data in applications such as natural language processing and object detection, inherently introduce sparsity into deep learning models. Sparse tensor accelerators exploit this sparsity (zeros) in the data to skip ineffectual computations and reduce overall run-time. Researchers have proposed numerous approaches, including encoding, decoding, non-zero extraction, and load balancing. However, because each implementation requires specialized hardware to accommodate its unique features, the design space for a new sparse accelerator is much larger than that for a regular tensor accelerator. These features are also hard to compare, since their efficiency varies with the application and the data sparsity. In this paper, we classify the features commonly used in sparse tensor accelerators and support their modeling. This support enables us to model a much larger design space and to estimate cost more accurately. Library support for these features is also included to make early-stage exploration more realistic. Overall, our experiments show that we can analytically estimate the previously un-modeled components with 93% accuracy on average, and we provide 19 features as library support.
Keywords