IEEE Journal on Exploratory Solid-State Computational Devices and Circuits (Jan 2019)
A Ferroelectric FET-Based Processing-in-Memory Architecture for DNN Acceleration
Abstract
This paper presents a ferroelectric FET (FeFET)-based processing-in-memory (PIM) architecture to accelerate the inference of deep neural networks (DNNs). We propose a digital in-memory vector-matrix multiplication (VMM) engine utilizing the FeFET crossbar to enable bit-parallel computation and eliminate the analog-to-digital conversion required in prior mixed-signal PIM designs. A dedicated hierarchical network-on-chip (H-NoC) is developed for input broadcasting and on-the-fly partial-result processing, reducing data transmission volume and latency. Simulations in 28-nm CMOS technology show 115× and 6.3× higher computing efficiency (GOPs/W) over a desktop GPU (Nvidia GTX 1080Ti) and a resistive random access memory (ReRAM)-based design, respectively.
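To make the digital bit-parallel VMM idea concrete, the sketch below (an illustration, not the paper's circuit) decomposes unsigned inputs and weights into bit planes, evaluates each plane with bitwise AND plus a digital column sum (as a crossbar with digital accumulation would), and recombines the planes with shift-adds, so no analog-to-digital conversion is needed. The bit widths and function name are assumptions for illustration.

```python
import numpy as np

IN_BITS = 4  # assumed input precision for this sketch
W_BITS = 4   # assumed weight precision (weight bits stored in FeFET cells)

def bitplane_vmm(x, W):
    """Digital VMM via bit-plane decomposition.

    x: (n,) unsigned ints; W: (n, m) unsigned ints.
    Each (input-bit, weight-bit) plane is a binary AND per cell,
    summed digitally along columns, then weighted by 2^(i+j).
    """
    acc = np.zeros(W.shape[1], dtype=np.int64)
    for i in range(IN_BITS):            # one input bit broadcast per step
        xb = (x >> i) & 1
        for j in range(W_BITS):         # one stored weight bit plane
            wb = (W >> j) & 1
            acc += (xb[:, None] & wb).sum(axis=0) << (i + j)
    return acc

rng = np.random.default_rng(0)
x = rng.integers(0, 16, size=8)
W = rng.integers(0, 16, size=(8, 3))
assert np.array_equal(bitplane_vmm(x, W), x @ W)  # matches exact VMM
```

The per-plane column sums correspond to the partial results that the H-NoC would aggregate on the fly before the final shift-add recombination.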
Keywords