FARANE-Q: Fast Parallel and Pipeline Q-Learning Accelerator for Configurable Reinforcement Learning SoC

Nana Sutisna; Andi M. Riyadhus Ilmy; Infall Syafalni; Rahmat Mulyawan; Trio Adiono

doi:10.1109/ACCESS.2022.3232853

IEEE Access (Jan 2023)

Affiliations

Nana Sutisna: School of Electrical Engineering and Informatics, Institut Teknologi Bandung, Bandung, Indonesia
Andi M. Riyadhus Ilmy: School of Electrical Engineering and Informatics, Institut Teknologi Bandung, Bandung, Indonesia
Infall Syafalni: School of Electrical Engineering and Informatics, Institut Teknologi Bandung, Bandung, Indonesia
Rahmat Mulyawan: School of Electrical Engineering and Informatics, Institut Teknologi Bandung, Bandung, Indonesia
Trio Adiono: School of Electrical Engineering and Informatics, Institut Teknologi Bandung, Bandung, Indonesia

DOI: https://doi.org/10.1109/ACCESS.2022.3232853
Journal volume & issue: Vol. 11, pp. 144–161

Abstract

This paper proposes a FAst paRAllel and pipeliNE Q-learning accelerator (FARANE-Q) for a configurable Reinforcement Learning (RL) algorithm implemented in a System on Chip (SoC). The proposed work offers flexibility, configurability, and scalability while maintaining computation speed and accuracy, addressing the challenges of dynamic environments and increasing complexity. To achieve flexibility, the proposed method includes a Hardware/Software (HW/SW) design methodology for the SoC architecture. We also propose joint optimizations of the algorithm, architecture, and implementation to obtain high-efficiency performance, specifically in energy and area efficiency. Furthermore, we implemented the proposed design on a real-time Zynq Ultra96-V2 FPGA platform to evaluate its functionality with an actual use case of smart navigation. Experimental results confirm that the proposed FARANE-Q accelerator outperforms state-of-the-art works, achieving a throughput of up to 148.55 MSps. This corresponds to an energy efficiency of 1747.64 MSps/W per agent for 32-bit and 2424.33 MSps/W per agent for 16-bit FARANE-Q. Moreover, the proposed 16-bit FARANE-Q improves on other related works by at least $1.23\times$ in energy efficiency. The designed system also maintains an error of less than 0.4% with optimized bit precision when more than eight fraction bits are used. The proposed FARANE-Q also offers a processing-time speedup of up to $1795\times$ over embedded SW computation executed on the ARM Zynq processor and $280\times$ over full software computation executed on an i7 processor. Hence, the proposed work has the potential to be used for smart navigation, robotic control, and predictive maintenance.
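
To illustrate what the abstract means by bit precision and fraction bits, the following is a minimal sketch of the standard tabular Q-learning update, Q(s,a) ← Q(s,a) + α[r + γ·max Q(s',a') − Q(s,a)], computed once in floating point and once in signed fixed-point arithmetic. It is not the FARANE-Q datapath: the function names, the truncating right shift after each multiplication, and the sample values are all assumptions made for illustration.

```python
def to_fixed(x: float, frac_bits: int) -> int:
    """Quantize a real value to a signed fixed-point integer (hypothetical helper)."""
    return round(x * (1 << frac_bits))


def to_float(q: int, frac_bits: int) -> float:
    """Convert a fixed-point integer back to a real value (hypothetical helper)."""
    return q / (1 << frac_bits)


def q_update_float(q_sa, reward, max_q_next, alpha, gamma):
    """Reference Q-learning update in floating point."""
    return q_sa + alpha * (reward + gamma * max_q_next - q_sa)


def q_update_fixed(q_sa, reward, max_q_next, alpha, gamma, frac_bits):
    """Same update on fixed-point integers.

    Assumption: each product carries 2*frac_bits fraction bits and is
    truncated back to frac_bits with an arithmetic right shift.
    """
    target = reward + ((gamma * max_q_next) >> frac_bits)
    delta = (alpha * (target - q_sa)) >> frac_bits
    return q_sa + delta


if __name__ == "__main__":
    frac_bits = 8  # illustrative choice, not a value taken from the paper
    values = dict(q_sa=0.5, reward=1.0, max_q_next=0.75, alpha=0.1, gamma=0.9)
    fixed_args = {k: to_fixed(v, frac_bits) for k, v in values.items()}
    exact = q_update_float(**values)
    approx = to_float(q_update_fixed(**fixed_args, frac_bits=frac_bits), frac_bits)
    print(f"float: {exact:.6f}  fixed: {approx:.6f}  abs error: {abs(exact - approx):.6f}")
```

Raising `frac_bits` in this sketch shrinks the quantization error of the coefficients and of each truncated product, which is the kind of accuracy-versus-word-length trade-off the abstract refers to. The error figures it prints are specific to these sample values and say nothing about the paper's measured results.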

Published in IEEE Access

ISSN: 2169-3536 (Online)
Publisher: IEEE
Country of publisher: United States
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering
Website: https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=6287639
