大数据 (Jul 2024)

LSTM training system based on heterogeneous hardware

  • Weixin HUANG,
  • Weifang HU,
  • Xuejiao CAO,
  • Xuanhua SHI

Journal volume & issue
Vol. 10
pp. 172–188

Abstract

In the era of big data, deep neural network models represented by LSTM can process massive amounts of data and perform excellently in fields such as language processing, speech recognition, and time-series prediction. However, as model complexity increases, training cost grows significantly. Existing LSTM training systems apply acceleration methods such as operator fusion and multi-streaming, but they neglect the parallelism available inside a single training operator, which leads to low utilization of computing resources and long training times. This paper therefore designs a training acceleration system called TurboLSTM, based on a fine-grained model partitioning method and a multi-stream parallel scheduling strategy. A new underlying training operator, built for NVIDIA GPU and domestic Ascend NPU heterogeneous hardware, makes efficient use of computing resources for its tasks. Compared with existing training systems, TurboLSTM achieves about a 23% speedup for a single operator and about a 17% speedup in overall model training time on NVIDIA GPU, and about a 15% single-operator speedup on Ascend NPU, with a significant increase in computing-resource utilization. This shows that the acceleration method is efficient and generalizes well.
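The abstract does not show TurboLSTM's operator implementation. As a minimal sketch of the kind of intra-operator multi-stream parallelism it describes, the following hypothetical PyTorch snippet issues the four LSTM gate projections of one time step on separate CUDA streams so the independent matrix multiplications can overlap; all function and variable names are assumptions for illustration, not the paper's API.

import torch

# Minimal sketch, not TurboLSTM's actual operator: one LSTM time step whose
# four gate projections run on separate CUDA streams so the independent
# matmuls can overlap on the device.
def lstm_step_multistream(x, h, c, w_ih, w_hh, b):
    """One LSTM step. Shapes follow PyTorch conventions:
    w_ih: (4*hidden, input), w_hh: (4*hidden, hidden), b: (4*hidden,),
    with gate ordering i, f, g, o."""
    hidden = h.size(1)
    streams = [torch.cuda.Stream() for _ in range(4)]
    gates = [None] * 4
    # Side streams must wait until the inputs are ready on the current stream.
    for s in streams:
        s.wait_stream(torch.cuda.current_stream())
    for k, s in enumerate(streams):
        with torch.cuda.stream(s):
            w1 = w_ih[k * hidden:(k + 1) * hidden]
            w2 = w_hh[k * hidden:(k + 1) * hidden]
            bk = b[k * hidden:(k + 1) * hidden]
            gates[k] = x @ w1.t() + h @ w2.t() + bk
    # Join: the current stream waits for all four gate streams.
    for s in streams:
        torch.cuda.current_stream().wait_stream(s)
    i = torch.sigmoid(gates[0])
    f = torch.sigmoid(gates[1])
    g = torch.tanh(gates[2])
    o = torch.sigmoid(gates[3])
    c_new = f * c + i * g
    h_new = o * torch.tanh(c_new)
    return h_new, c_new

For small matrices a single fused projection may still outperform this split; the sketch only illustrates where parallelism exists inside one LSTM training operator, which is the gap the paper says existing systems leave unexploited.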

Keywords