Труды Института системного программирования РАН (Oct 2018)
Optimizations in Dynamic Binary Translation
Abstract
We suggest using OpenCL standard for programming FPGA devices that are used as accelerators in a heterogeneous system. We describe the implementation of a subset of OpenCL that is required for organizing data exchange and task management for FPGAs given that CPU and FPGA are connected via PCI-express bus. Basically, the first part of the required functions is the simple device manipulation and FPGA program loading; the latter requires flashing the FPGA via the JTAG interface. The second part is the memory buffer transfer to and from the FPGA. Its implementation in the runtime library is straightforward given that the FPGA supports PCI-express exchanges; the main load falls onto the FPGA driver and the FPGA system-level firmware organizing these exchanges. The final part is the FPGA task management that is achieved via the simple task scheduler implemented within the FPGA driver. The code running on FPGA can be created with a hardware description language or generated automatically using one of the known translators, e.g. C-to-Verilog, but it should adhere to the ABI described by the FPGA driver and firmware implementations.