Frontiers in Computational Neuroscience (Jul 2021)

Toward Software-Equivalent Accuracy on Transformer-Based Deep Neural Networks With Analog Memory Devices

  • Katie Spoon,
  • Hsinyu Tsai,
  • An Chen,
  • Malte J. Rasch,
  • Stefano Ambrogio,
  • Charles Mackin,
  • Andrea Fasoli,
  • Alexander M. Friz,
  • Pritish Narayanan,
  • Milos Stanisavljevic,
  • Geoffrey W. Burr

DOI
https://doi.org/10.3389/fncom.2021.675741
Journal volume & issue
Vol. 15

Abstract

Read online

Recent advances in deep learning have been driven by ever-increasing model sizes, with networks growing to millions or even billions of parameters. Such enormous models call for fast and energy-efficient hardware accelerators. We study the potential of Analog AI accelerators based on Non-Volatile Memory, in particular Phase Change Memory (PCM), for software-equivalent inference accuracy on natural language processing applications. We demonstrate a path to software-equivalent accuracy for the GLUE benchmark on BERT (Bidirectional Encoder Representations from Transformers) by combining noise-aware training, which counters inherent PCM drift and noise sources, with reduced-precision digital attention-block computation down to INT6.
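The abstract refers to noise-aware training, i.e., training that exposes the network to weight perturbations resembling analog-device noise so that the learned weights remain accurate when deployed on PCM hardware. The sketch below is a minimal, generic illustration of that idea in PyTorch, not the authors' implementation: the layer name `NoisyLinear` and the `noise_scale` value are assumptions introduced here for clarity.

```python
import torch
import torch.nn as nn

class NoisyLinear(nn.Linear):
    """Linear layer with additive Gaussian weight noise during training.

    Generic sketch of noise-aware (hardware-aware) training: each forward
    pass perturbs the weights with noise whose magnitude is a fraction of
    the largest weight, so the trained weights become tolerant to read
    noise and conductance drift of analog memory devices. The value of
    `noise_scale` is illustrative, not taken from the paper.
    """

    def __init__(self, in_features, out_features, noise_scale=0.02, **kwargs):
        super().__init__(in_features, out_features, **kwargs)
        self.noise_scale = noise_scale

    def forward(self, x):
        if self.training and self.noise_scale > 0:
            # Noise amplitude proportional to the maximum absolute weight,
            # loosely mimicking per-device conductance fluctuations.
            sigma = self.noise_scale * self.weight.abs().max()
            noisy_weight = self.weight + torch.randn_like(self.weight) * sigma
        else:
            # At evaluation time the nominal weights are used; on real
            # hardware they would instead be corrupted by PCM noise/drift.
            noisy_weight = self.weight
        return nn.functional.linear(x, noisy_weight, self.bias)
```

In practice such a layer would replace the standard linear projections of a Transformer during fine-tuning, leaving the rest of the training loop unchanged.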

Keywords