IEEE Journal of the Electron Devices Society (Jan 2024)

A 3-D Bank Memory System for Low-Power Neural Network Processing Achieved by Instant Context Switching and Extended Power Gating Time

  • Kouhei Toyotaka
  • Yuto Yakubo
  • Kazuma Furutani
  • Haruki Katagiri
  • Masashi Fujita
  • Yoshinori Ando
  • Toru Nakura
  • Shunpei Yamazaki

DOI
https://doi.org/10.1109/JEDS.2024.3418036
Journal volume & issue
Vol. 12, pp. 486–494

Abstract

Using a 3-D monolithic stacking memory technology based on crystalline oxide semiconductor (OS) transistors, we fabricated a test chip with three kinds of memory: AI accelerator (ACC) memory for the weight data of a neural network (NN), backup memory for flip-flops (FF), and CPU memory for storing instructions and data. These memories are composed of two layers of OS transistors on Si CMOS, where the memories in each layer correspond to a bank. In this structure, bank switching of the ACC memory works together with bank switching of the FF backup memory, so inference can be switched between different NNs with low latency and low power, which allows the power-gating standby time to be extended. Consequently, a 92% reduction in power consumption is achieved in inference at a frame rate of 60 fps, compared with a chip using static random access memory (SRAM) as the ACC memory.

Keywords