IEEE Access (Jan 2023)

A Digital Processing in Memory Architecture Using TCAM for Rapid Learning and Inference Based on a Spike Location Dependent Plasticity

  • Seong Min Kim,
  • Kyeong Min Kim,
  • Ji Hoon Choi,
  • Seung Man Kang,
  • Min Kyung Bang,
  • So Hee Park,
  • Eon Gyeong Lee,
  • Seong Bae Park,
  • Choong Seon Hong,
  • Sang Hoon Hong

DOI: https://doi.org/10.1109/ACCESS.2023.3234323
Journal volume & issue: Vol. 11, pp. 3416–3430

Abstract

In this paper, we present a digital processing in memory (DPIM) architecture configured as a stride edge-detection search frequency neural network (SE-SFNN), which is trained through spike location dependent plasticity (SLDP), a learning mechanism reminiscent of spike timing dependent plasticity (STDP). This mechanism allows for rapid online learning as well as a simple memory-based implementation. In particular, we employ a ternary data scheme to take advantage of ternary content addressable memory (TCAM). The scheme uses a ternary representation of the image pixels, and the TCAMs are arranged in a two-layer format to significantly reduce computation time. The first layer applies several filtering kernels; the second layer reorders the pattern dictionaries of the TCAMs so that the most frequent patterns sit at the top of each supervised TCAM dictionary. Numerous TCAM blocks in both layers operate in a massively parallel fashion on digital ternary values. No multiply operations are performed, and learning proceeds in a feedforward scheme. This allows rapid and robust learning, traded off against the parallel memory block size. Furthermore, we propose a method to reduce the TCAM memory size using a two-tiered minor-to-major promotion (M2MP) of frequently occurring patterns. This reduction scheme runs concurrently with the learning operation without incurring a preconditioning overhead. We show that with minimal circuit overhead, the required memory size is reduced by 84.4% and the total clock cycles required for learning decrease by 97.31%, while the accuracy decreases by only 1.12%. We classified images with 94.58% accuracy on the MNIST dataset. Using a 100 MHz clock, our simulation results show that MNIST training takes about 6.3 ms while dissipating less than 4 mW of average power. In terms of inference speed, the trained hardware can process 5,882,352 images per second.
