IEEE Access (Jan 2018)
Efficient Scheduling in Training Deep Convolutional Networks at Large Scale
Abstract
The deep convolutional network is one of the most successful machine learning models of recent years. However, training large deep networks is a time-consuming process. Because these networks contain a large number of parameters, the efficiency of data-parallel training methods is usually limited by network communication speed. In this paper, we introduce two new algorithms to speed up the training of large deep networks on multiple machines: (1) a new scheduling algorithm that reduces communication delay in gradient transmission and (2) a new collective algorithm, based on a reverse-reduce tree, that reduces link contention. We implement our algorithms on top of the well-known library Caffe and obtain near-linear scaling on commodity Ethernet networks.
Keywords