IEEE Access (Jan 2023)

Most Resource Efficient Matrix Vector Multiplication on FPGAs

  • Alexander Lehnert,
  • Philipp Holzinger,
  • Simon Pfenning,
  • Ralf Müller,
  • Marc Reichenbach

DOI: https://doi.org/10.1109/ACCESS.2023.3234622
Journal volume & issue: Vol. 11, pp. 3881–3898

Abstract


Fast and resource-efficient inference in artificial neural networks (ANNs) is of utmost importance and drives many new developments, both in hardware architectures, e.g., systolic arrays, and in algorithmic optimizations such as pruning. In this paper, we present a novel method for lowering the computational effort of ANN inference using ideas from information theory. Weight matrices are sliced into submatrices of logarithmic aspect ratios, and these slices are then factorized. This reduces the number of required computations without compromising fully parallel processing. We create a new hardware architecture dedicated to this purpose, and we provide a tool that maps these sliced and factorized matrices efficiently to reconfigurable hardware. In comparison with state-of-the-art FPGA implementations, we substantiate our claims by lowering the hardware resources, measured in look-up tables (LUTs), by a factor of three to six. Our method does not rely on any particular property of the weight matrices of the ANN: it works for the general task of multiplying an input vector by a constant matrix and is therefore also suitable for digital signal processing beyond ANNs.
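The slicing idea from the abstract can be illustrated with a short sketch. The Python snippet below is not from the paper: the function names, the column-wise slice orientation, and the slice width of roughly log2 of the row count (a stand-in for the paper's "logarithmic aspect ratios") are illustrative assumptions. It shows only how the matrix-vector product decomposes into a sum of partial products over slices; the factorization of each slice into sparse factors, where the actual computation savings arise, is omitted here.

import numpy as np

def slice_columns(W, block_width):
    # Split W into column blocks of at most `block_width` columns.
    return [W[:, j:j + block_width] for j in range(0, W.shape[1], block_width)]

def sliced_matvec(W, x):
    # Compute W @ x as a sum of partial products over column slices.
    # The slice width ~ log2(#rows) is an illustrative assumption; in the
    # paper each slice would additionally be replaced by a product of
    # sparse factors before being mapped to LUTs.
    width = max(1, int(np.ceil(np.log2(W.shape[0]))))
    y = np.zeros(W.shape[0])
    for i, Wi in enumerate(slice_columns(W, width)):
        xi = x[i * width : i * width + Wi.shape[1]]
        y += Wi @ xi  # partial product contributed by one slice
    return y

# Quick check against direct multiplication.
rng = np.random.default_rng(0)
W = rng.standard_normal((64, 64))
x = rng.standard_normal(64)
assert np.allclose(sliced_matvec(W, x), W @ x)

Because each partial product depends only on its own slice of W and x, the slices can be processed fully in parallel, which is the property the abstract emphasizes.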

Keywords