Demand MemCpy: Overlapping of Computation and Data Transfer for Heterogeneous Computing

Donghun Jeong; Jihun Park; Jungrae Kim

doi:10.1109/ACCESS.2022.3195271

IEEE Access (Jan 2022)

Affiliations

Donghun Jeong: Department of Electrical and Computer Engineering, Sungkyunkwan University, Suwon-si, South Korea
Jihun Park: Department of Artificial Intelligence, Sungkyunkwan University, Suwon-si, South Korea
Jungrae Kim: Department of Semiconductor Systems Engineering, Sungkyunkwan University, Suwon-si, South Korea

Journal volume & issue: Vol. 10, pp. 79925–79938

Abstract

Heterogeneous computing relies on different types of processors collaborating on shared data. In systems with discrete accelerators (e.g., GP-GPUs), sharing data requires transferring large amounts of it between CPU and accelerator memories, which can significantly increase the end-to-end execution time. This paper proposes a novel mechanism called Demand MemCpy (DMC) to hide this data-sharing overhead. DMC copies data from host memory to accelerator memory on demand at page granularity. It uses a hardware-only mechanism to fetch the requested page with short latency, and a background pre-copy to fetch related pages in advance. Our evaluation shows that DMC reduces the end-to-end execution time of GP-GPU applications by 25.4% on average by overlapping computation with data transfer and by not transferring unused pages.
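
The idea of copying pages on demand while pre-copying related pages in the background can be illustrated with a small software model. The sketch below is only a toy under stated assumptions: it is not the hardware mechanism the paper describes, and the names `DemandCopyModel`, `prefetch_window`, and `simulate`, as well as the choice of "the next few pages" as the related pages, are hypothetical and introduced here for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class DemandCopyModel:
    """Toy model of page-granular demand copy with background pre-copy.

    Assumption (not from the paper): "related pages" are simply the
    `prefetch_window` pages that follow a demand-fetched page.
    """
    num_pages: int
    prefetch_window: int = 2
    resident: set = field(default_factory=set)  # pages already in accelerator memory
    demand_fetches: int = 0                     # pages copied because they were accessed
    precopied: int = 0                          # pages copied ahead of any access

    def access(self, page: int) -> None:
        if not 0 <= page < self.num_pages:
            raise IndexError(f"page {page} out of range")
        if page in self.resident:
            return  # already copied: no transfer needed
        # Demand fetch: copy the page the accelerator asked for.
        self.resident.add(page)
        self.demand_fetches += 1
        # Background pre-copy: bring in the following pages in advance.
        last = min(page + self.prefetch_window, self.num_pages - 1)
        for nxt in range(page + 1, last + 1):
            if nxt not in self.resident:
                self.resident.add(nxt)
                self.precopied += 1

    @property
    def pages_never_copied(self) -> int:
        return self.num_pages - len(self.resident)


def simulate(accesses, num_pages, prefetch_window=2) -> DemandCopyModel:
    """Replay a sequence of page accesses and return the final model state."""
    model = DemandCopyModel(num_pages=num_pages, prefetch_window=prefetch_window)
    for page in accesses:
        model.access(page)
    return model


# A kernel that touches only the first 6 of 16 pages.
model = simulate(accesses=[0, 1, 2, 3, 4, 5], num_pages=16)
print(model.demand_fetches, model.precopied, model.pages_never_copied)
```

In this model an up-front copy would transfer all 16 pages before computation starts, whereas the demand-driven variant transfers only the pages that are touched or pre-copied. The model counts pages only; it says nothing about latency or about the 25.4% figure reported in the abstract.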

Published in IEEE Access

ISSN: 2169-3536 (Online)
Publisher: IEEE
Country of publisher: United States
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering
Website: https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=6287639
