IEEE Access (Jan 2022)

Optimization of Multi-Core Accelerator Performance Based on Accurate Performance Estimation

  • Sunwoo Kim,
  • Youngho Seo,
  • Sungkyung Park,
  • Chester Sungchung Park

DOI
https://doi.org/10.1109/ACCESS.2022.3151876
Journal volume & issue
Vol. 10
pp. 19629 – 19642

Abstract

Multicore accelerators have emerged to efficiently execute recent applications with complex computational dimensions. Compared to a single-core accelerator, a multicore accelerator handles a larger amount of communication and computation simultaneously. Since the conventional performance estimation algorithm, which is tailored to single-core accelerators, cannot accurately estimate the performance of multicore accelerators, we propose a novel performance estimation algorithm for multicore accelerators. The proposed algorithm predicts the dynamic communication bandwidth of each direct memory access controller (DMAC) based on the runtime state of the DMACs. By taking the temporal intervals into account, it can accurately estimate the communication amounts handled by the DMACs. The proposed algorithm is evaluated for convolutional neural networks and wireless communications. Experimental results using a pre-register-transfer-level (pre-RTL) simulator show that the proposed algorithm can estimate the performance of a multicore accelerator with an estimation error of at most 2.8%, regardless of the system communication bandwidth. These results were also verified by hardware implementations on Xilinx ZYNQ. In addition, the proposed algorithm is used to explore the design space of accelerator core dimensions, and the resulting optimal core dimension provides performance gains of 10.8% and 31.2% compared to the conventional multicore accelerator and the single-core accelerator, respectively. The source code is available on GitHub: https://github.com/SDL-KU/OptAccTile.
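
To illustrate why the runtime state of the DMACs matters, the following is a minimal sketch of interval-based bandwidth estimation. It is not the authors' algorithm (see the repository above for that). It assumes, purely for illustration, that all transfers start at the same time and that the active DMACs split the system bandwidth equally; the function name and parameters are hypothetical.

```python
def estimate_finish_times(transfer_bytes, system_bandwidth):
    """Estimate when each DMAC transfer finishes under shared bandwidth.

    Illustrative assumptions (not taken from the paper): every transfer
    starts at t = 0, and the DMACs that are still active share
    system_bandwidth (bytes per time unit) equally.
    """
    remaining = {i: float(b) for i, b in enumerate(transfer_bytes)}
    finish = [0.0] * len(transfer_bytes)
    now = 0.0
    while remaining:
        # Per-DMAC bandwidth depends on how many DMACs are active right now.
        share = system_bandwidth / len(remaining)
        # The current interval ends when the smallest remaining transfer completes.
        smallest = min(remaining.values())
        now += smallest / share
        for idx in list(remaining):
            if remaining[idx] <= smallest:
                finish[idx] = now          # this DMAC goes idle
                del remaining[idx]
            else:
                remaining[idx] -= smallest  # bytes moved during this interval
    return finish


if __name__ == "__main__":
    # Two DMACs moving 100 and 300 bytes over a 100 bytes/unit system bus.
    print(estimate_finish_times([100, 300], 100.0))
```

In this example the second DMAC receives half of the bandwidth only while the first is active, then the full bandwidth afterwards. A static estimate that fixed its share at half the bandwidth for the whole transfer would overestimate its finish time.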

Keywords