Accelerate Scientific Deep Learning Models on Heterogeneous Computing Platform with FPGA

Jiang Chao; Ojika David; Vallecorsa Sofia; Kurth Thorsten; Prabhat; Patel Bhavesh; Lam Herman

doi:10.1051/epjconf/202024509014

EPJ Web of Conferences (Jan 2020)

Affiliations

Jiang Chao: SHREC: NSF Center for Space, High-Performance, and Resilient Computing, University of Florida
Ojika David
Vallecorsa Sofia: CERN openlab
Kurth Thorsten: NVIDIA
Prabhat: National Energy Research Scientific Computing Center
Patel Bhavesh: Dell EMC
Lam Herman: SHREC: NSF Center for Space, High-Performance, and Resilient Computing, University of Florida

DOI: https://doi.org/10.1051/epjconf/202024509014
Journal volume & issue: Vol. 245, p. 09014

Abstract

AI and deep learning are experiencing explosive growth in almost every domain involving analysis of big data. Deep learning using Deep Neural Networks (DNNs) has shown great promise for such scientific data analysis applications. However, traditional CPU-based sequential computing without special instructions can no longer meet the requirements of mission-critical applications, which are compute-intensive and require low latency and high throughput. Heterogeneous computing (HGC), with CPUs integrated with GPUs, FPGAs, and other science-targeted accelerators, offers unique capabilities to accelerate DNNs. Collaborating researchers at SHREC at the University of Florida, CERN openlab, NERSC at Lawrence Berkeley National Lab, Dell EMC, and Intel are studying the application of HGC to scientific problems using DNN models. This paper focuses on the use of FPGAs to accelerate the inferencing stage of the HGC workflow. We present case studies and results in inferencing state-of-the-art DNN models for scientific data analysis, using the Intel distribution of OpenVINO, running on an Intel Programmable Acceleration Card (PAC) equipped with an Arria 10 GX FPGA. Using the Intel Deep Learning Acceleration (DLA) development suite to optimize existing FPGA primitives and develop new ones, we were able to accelerate the scientific DNN models under study, with speedups ranging from 2.46x to 9.59x for a single Arria 10 FPGA against a single core (single thread) of a server-class Skylake CPU.

Published in EPJ Web of Conferences

ISSN: 2100-014X (Online)
Publisher: EDP Sciences
Country of publisher: France
LCC subjects: Science: Physics
Website: http://www.epj-conferences.org/
