IEEE Access (Jan 2024)
Auto Batching Scheme for Optimizing LSTM Inference on FPGA Platforms
Abstract
This paper presents an auto batching scheme designed to optimize Long Short-Term Memory (LSTM) inference on Field-Programmable Gate Array (FPGA) platforms. Existing block batching methods struggle with LSTM models that have large hidden sizes: insufficient on-chip memory impedes weight prefetching and leads to repeated evictions and reloads, significantly reducing processing utilization. Our approach extends block batching with weight-stationary block batching (WSBB), which allows computation to proceed without stalls regardless of prefetch availability. Additionally, bypass-enabled block batching (BEBB) prevents pollution of on-chip memory even when its capacity is insufficient, while fully leveraging off-chip memory bandwidth. Experimental results on both synthetic benchmarks (the DeepBench suite) and a real-world application (RNN-T) validate the superior performance and efficiency of the proposed method. Our auto batching scheme demonstrates up to a 3.7 times speedup over previous block batching while maintaining high computational efficiency, even with limited on-chip memory. Furthermore, the FPGA-based implementation of our scheme achieves a 5 times speedup over the CPU and 4.3 times higher energy efficiency (GFLOP/s/W) compared to the GPU.
Keywords