IEEE Access (Jan 2020)
NAND Flash Based Novel Synaptic Architecture for Highly Robust and High-Density Quantized Neural Networks With Binary Neuron Activation of (1, 0)
Abstract
We propose a novel synaptic architecture based on a NAND flash memory for highly robust and high-density quantized neural networks (QNN) with 4-bit weight and binary neuron activation, for the first time. The proposed synaptic architecture is fully compatible with the conventional NAND flash memory architecture by adopting a differential sensing scheme and a binary neuron activation of (1, 0). A binary neuron enables using a 1-bit sense amplifier, which significantly reduces the burden of peripheral circuits and power consumption and enables bitwise communication between the layers of neural networks. Operating NAND cells in the saturation region eliminates the effect of metal wire resistance and serial resistance of the NAND cells. With a read-verify-write (RVW) scheme, low-variance conductance distribution is demonstrated for 8 levels. Vector-matrix multiplication (VMM) of a 4-bit weight and binary activation can be accomplished by only one input pulse, eliminating the need of a multiplier and an additional logic operation. In addition, quantization training can minimize the degradation of the inference accuracy compared to post-training quantization. Finally, the low-variance conductance distribution of the NAND cells achieves a higher inference accuracy compared to that of resistive random access memory (RRAM) devices by 2~7 % and 0.04~0.23 % for CIFAR 10 and MNIST datasets, respectively.
Keywords