IEEE Journal on Exploratory Solid-State Computational Devices and Circuits (Jan 2021)
Ferroelectric Field-Effect Transistor-Based 3-D NAND Architecture for Energy-Efficient On-Chip Training Accelerator
Abstract
Unlike the deep neural network (DNN) inference process, the training process produces a large amount of intermediate data that is needed to compute the updated network weights. The on-chip global buffer (e.g., SRAM cache) generally has limited capacity because of its low memory density; therefore, off-chip DRAM access is inevitable during training. In this work, a novel ferroelectric field-effect transistor (FeFET)-based 3-D NAND architecture for an on-chip training accelerator is proposed. The reduced peripheral-circuit overhead enabled by the low operating voltage of the FeFET device, together with the ultrahigh density of the 3-D NAND architecture, allows all the intermediate data to be stored and computed on chip during training. We present a custom design of a 108-Gb chip with a 59.91-mm² area and 45% array efficiency. Data mapping schemes for weights, activations, and errors that are compatible with the 3-D NAND architecture are investigated. Training performance is evaluated by training the ResNet-18 model on this architecture with the ImageNet data set at 8-bit precision. Owing to the minimized off-chip memory access, an energy efficiency of 7.76 TOPS/W is achieved for 8-bit on-chip training.
Keywords