EPJ Web of Conferences (Jan 2024)

KServe inference extension for an FPGA vendor-free ecosystem

  • Ciangottini Diego,
  • Bianchini Giulio,
  • Mariotti Mirko,
  • Spiga Daniele,
  • Storchi Loriano,
  • Surace Giacomo

DOI: https://doi.org/10.1051/epjconf/202429511012
Journal volume & issue: Vol. 295, p. 11012

Abstract

Field Programmable Gate Arrays (FPGAs) are playing an increasingly important role in the sampling and data-processing industry thanks to their intrinsically parallel architecture, low power consumption, and flexibility to execute custom algorithms. In particular, the use of FPGAs for Machine Learning (ML) inference is growing rapidly, driven by High-Level Synthesis (HLS) projects that abstract away the complexity of Hardware Description Language (HDL) programming. In this work we describe our experience extending KServe predictors, an emerging standard for ML model inference as a service on Kubernetes, to support a custom workflow capable of loading and serving models on demand on top of FPGAs. A key aspect of the proposed approach is making firmware generation, often an obstacle to widespread FPGA adoption, transparent to the user. We detail how the system automates both the synthesis of the HDL code and the generation of the firmware, starting from a high-level language and user-friendly machine learning libraries. The ecosystem is completed by the adoption of a common language for sharing user models and firmware, based on a dedicated Open Container Initiative (OCI) artifact definition, thus leveraging well-established practices for managing resources in a container registry.
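To make the serving side concrete, the sketch below builds a request in the standard KServe V1 inference protocol, which any custom predictor (such as the FPGA-backed one described here) exposes to its clients. The model name, host, and input shape are illustrative placeholders, not identifiers from the paper.

```python
import json

# Illustrative placeholders: a real deployment would use the model name and
# the cluster ingress of the deployed InferenceService.
MODEL_NAME = "fpga-model"
HOST = "http://localhost:8080"

def build_predict_request(instances):
    """Return the (url, json_body) pair for a KServe V1 ':predict' call."""
    url = f"{HOST}/v1/models/{MODEL_NAME}:predict"
    body = json.dumps({"instances": instances})
    return url, body

url, body = build_predict_request([[0.1, 0.2, 0.3]])
print(url)   # http://localhost:8080/v1/models/fpga-model:predict
print(body)  # {"instances": [[0.1, 0.2, 0.3]]}
```

Because the predictor speaks this common protocol, clients are unaware of whether inference runs on a CPU, GPU, or FPGA backend.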