IEEE Access (Jan 2024)
Off-Chip Memory Allocation for Neural Processing Units
Abstract
Many modern Systems-on-Chip (SoCs) are equipped with specialized Machine Learning (ML) accelerators that use both on-chip and off-chip memory to execute neural networks. While on-chip memory usually has a hard limit, off-chip memory is often assumed to be large enough to hold the network's inputs, outputs, weights, and any intermediate results produced during model execution. This assumption may not hold for edge devices, such as smartphones, which usually limit the amount of memory a process can use. In this study, we propose a novel approach for minimizing a neural network's off-chip memory usage by introducing a tile-aware allocator capable of reusing memory occupied by parts of a tensor before the entire tensor expires. We describe the necessary conditions for such an off-chip memory allocation approach and report experimental results showing that it saves up to 33% of the peak off-chip memory usage on some common network architectures.
Keywords