IEEE Access (Jan 2023)
An In-Situ Dynamic Quantization With 3D Stacking Synaptic Memory for Power-Aware Neuromorphic Architecture
Abstract
Spiking Neural Networks (SNNs) show potential for lightweight, low-power inference because they mimic the functionality of the biological brain. However, like other neural networks, SNNs face a major challenge in the memory wall and power wall encountered when accessing data (synaptic weights) from memory. This limits the potential of SNNs implemented on edge devices. In this paper, we present a novel spiking computing hardware architecture named NASH-3DM, which uses 3D-IC-based stacked memory with power-supply awareness to effectively decrease power consumption for AI-enabled edge devices. Instead of storing one or multiple weights in a single memory word, we split them into small subsets and allocate each subset to a separate memory in each stacking layer. Because the stack layers are naturally separated, our system can activate and deactivate each layer independently. It can therefore offer in-situ (online, post-manufacture, and without interruption) dynamic quantization with multiple operating modes. In 45 nm CMOS technology, the energy per synaptic operation for MNIST classification is reduced by 36.67% with an accuracy loss of 0.93% to 1.14% at 5-bit quantization. For the CIFAR10 dataset, the energy per synaptic operation is reduced by 36.68% when switching from the 16-bit active operation to the in-situ 10-bit one, with an accuracy loss of 5.69%.
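The layer-wise splitting described above can be illustrated with a minimal software sketch. It is not the NASH-3DM hardware design: it assumes that each subset is a contiguous slice of weight bits, that the most significant slice sits in the first layer, that weights are unsigned, and that a deactivated layer reads as zeros. The function names `split_weight` and `read_weight` and the 2-bit slice width are hypothetical choices made for illustration only.

```python
from typing import List


def split_weight(weight: int, total_bits: int, bits_per_layer: int) -> List[int]:
    """Split an unsigned weight into per-layer bit slices, most significant first.

    Assumption: one slice per stacking layer, all slices the same width.
    """
    if total_bits % bits_per_layer != 0:
        raise ValueError("total_bits must be a multiple of bits_per_layer")
    if not 0 <= weight < (1 << total_bits):
        raise ValueError("weight does not fit in total_bits")
    n_layers = total_bits // bits_per_layer
    mask = (1 << bits_per_layer) - 1
    return [
        (weight >> (bits_per_layer * (n_layers - 1 - i))) & mask
        for i in range(n_layers)
    ]


def read_weight(slices: List[int], active_layers: int, bits_per_layer: int) -> int:
    """Reassemble a weight while only the first `active_layers` layers are powered.

    Assumption: slices from deactivated layers contribute zeros, so the
    result is the original weight truncated to its upper bits.
    """
    if not 0 <= active_layers <= len(slices):
        raise ValueError("active_layers out of range")
    value = 0
    for i, s in enumerate(slices):
        value <<= bits_per_layer
        if i < active_layers:
            value |= s  # powered layer: its bits are visible
    return value


# A 16-bit weight stored as eight hypothetical 2-bit slices.
slices = split_weight(0xABCD, total_bits=16, bits_per_layer=2)
full = read_weight(slices, active_layers=8, bits_per_layer=2)      # all 16 bits
reduced = read_weight(slices, active_layers=5, bits_per_layer=2)   # upper 10 bits
print(hex(full), hex(reduced))  # prints 0xabcd 0xabc0
```

In this sketch, switching from eight active layers to five corresponds to moving from 16-bit to 10-bit weights without rewriting the stored data, which is the sense in which the quantization can be changed at run time.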
Keywords