High-Confidence Computing (Jun 2023)

PRAP-PIM: A weight pattern reusing aware pruning method for ReRAM-based PIM DNN accelerators

  • Zhaoyan Shen,
  • Jinhao Wu,
  • Xikun Jiang,
  • Yuhao Zhang,
  • Lei Ju,
  • Zhiping Jia

Journal volume & issue
Vol. 3, no. 2
p. 100123

Abstract


Resistive Random-Access Memory (ReRAM) based Processing-in-Memory (PIM) frameworks have been proposed to accelerate the execution of DNN models by eliminating data movement between the computing and memory units. To further reduce space and energy consumption, DNN model weight sparsity and weight pattern repetition are exploited to optimize these ReRAM-based accelerators. However, most of these works focus on only one aspect of this software/hardware co-design framework and optimize it in isolation, which leaves the design far from optimal. In this paper, we propose PRAP-PIM, which jointly exploits weight sparsity and weight pattern repetition by using a weight pattern reusing aware pruning method. By relaxing the precondition for weight pattern reusing, we propose a similarity-based weight pattern reusing method that can achieve a higher weight pattern reusing ratio. Experimental results show that PRAP-PIM achieves a 1.64× performance improvement and a 1.51× energy efficiency improvement on popular deep learning benchmarks, compared with state-of-the-art ReRAM-based DNN accelerators.
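The core idea of similarity-based weight pattern reusing can be illustrated with a small sketch. The snippet below is not the paper's algorithm; it is a minimal, hypothetical greedy scheme in which each pruned kernel is flattened into a weight pattern and mapped to an earlier representative pattern whenever their cosine similarity exceeds a threshold, so that only the representatives would need to occupy ReRAM crossbar space. The function name, threshold value, and greedy matching order are all illustrative assumptions.

```python
import numpy as np

def reuse_similar_patterns(kernels, threshold=0.95):
    """Greedy similarity-based pattern reusing (illustrative sketch only).

    Each 2-D kernel is flattened into a weight pattern. A pattern reuses
    an earlier representative when their cosine similarity reaches
    `threshold`; otherwise it becomes a new representative itself.
    Returns the representative patterns and a kernel->representative map.
    """
    reps = []     # representative patterns (each would be mapped to a crossbar)
    mapping = []  # mapping[i] = index of the representative kernel i reuses
    for k in kernels:
        v = np.asarray(k, dtype=float).ravel()
        v_norm = np.linalg.norm(v)
        matched = None
        for i, r in enumerate(reps):
            # Cosine similarity between the candidate and a representative.
            sim = float(v @ r) / (v_norm * np.linalg.norm(r) + 1e-12)
            if sim >= threshold:
                matched = i
                break
        if matched is None:
            reps.append(v)
            matched = len(reps) - 1
        mapping.append(matched)
    return reps, mapping

# Usage: two near-identical kernels share one pattern, a dissimilar one does not.
rng = np.random.default_rng(0)
base = rng.standard_normal((3, 3))
kernels = [base, base * 1.01, rng.standard_normal((3, 3))]
reps, mapping = reuse_similar_patterns(kernels)
print(len(reps), mapping)
```

Relaxing the exact-match precondition to a similarity threshold is what raises the reusing ratio: scaled or slightly perturbed patterns, which an exact comparison would treat as distinct, collapse onto one stored representative.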

Keywords