IEEE Access (Jan 2023)

Most Resource Efficient Matrix Vector Multiplication on FPGAs

  • Alexander Lehnert,
  • Philipp Holzinger,
  • Simon Pfenning,
  • Ralf Müller,
  • Marc Reichenbach

DOI: https://doi.org/10.1109/ACCESS.2023.3234622
Journal volume & issue: Vol. 11, pp. 3881–3898

Abstract


Fast and resource-efficient inference in artificial neural networks (ANNs) is of utmost importance and drives many new developments, both in hardware architectures, e.g., systolic arrays, and in algorithmic optimizations such as pruning. In this paper, we present a novel method for lowering the computational effort of ANN inference using ideas from information theory. Weight matrices are sliced into submatrices of logarithmic aspect ratios, and these slices are then factorized. This reduces the number of required computations without compromising fully parallel processing. We create a new hardware architecture dedicated to this purpose, and we provide a tool that maps these sliced and factorized matrices efficiently to reconfigurable hardware. In comparison with state-of-the-art FPGA implementations, we substantiate our claims by lowering the hardware resources, measured in look-up tables (LUTs), by a factor of three to six. Our method does not rely on any particular property of the weight matrices of the ANN: it works for the general task of multiplying an input vector by a constant matrix and is therefore also suitable for digital signal processing beyond ANNs.
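The slicing idea from the abstract can be illustrated with a short sketch. The Python snippet below is not from the paper: the function names, the column-wise slice orientation, and the slice width of roughly log2 of the row count (a stand-in for the paper's "logarithmic aspect ratios") are illustrative assumptions. It shows only how the matrix-vector product decomposes into a sum of partial products over slices; the factorization of each slice into sparse factors, where the actual computation savings arise, is omitted here.

import numpy as np

def slice_columns(W, block_width):
    # Split W into column blocks of at most `block_width` columns.
    return [W[:, j:j + block_width] for j in range(0, W.shape[1], block_width)]

def sliced_matvec(W, x):
    # Compute W @ x as a sum of partial products over column slices.
    # The slice width ~ log2(#rows) is an illustrative assumption; in the
    # paper each slice would additionally be replaced by a product of
    # sparse factors before being mapped to LUTs.
    width = max(1, int(np.ceil(np.log2(W.shape[0]))))
    y = np.zeros(W.shape[0])
    for i, Wi in enumerate(slice_columns(W, width)):
        xi = x[i * width : i * width + Wi.shape[1]]
        y += Wi @ xi  # partial product contributed by one slice
    return y

# Quick check against direct multiplication.
rng = np.random.default_rng(0)
W = rng.standard_normal((64, 64))
x = rng.standard_normal(64)
assert np.allclose(sliced_matvec(W, x), W @ x)

Because each partial product depends only on its own slice of W and x, the slices can be processed fully in parallel, which is the property the abstract emphasizes.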

Keywords