Applied Sciences (Nov 2022)

A Scaling Transition Method from SGDM to SGD with 2ExpLR Strategy

  • Kun Zeng,
  • Jinlan Liu,
  • Zhixia Jiang,
  • Dongpo Xu

DOI: https://doi.org/10.3390/app122312023
Journal volume & issue: Vol. 12, No. 23, p. 12023

Abstract

In deep learning, vanilla stochastic gradient descent (SGD) and SGD with heavy-ball momentum (SGDM) are widely used because of their simplicity and strong generalization. This paper uses an exponential scaling method to realize a smooth and stable transition from SGDM to SGD (named TSGD), combining the fast training speed of SGDM with the accurate convergence of SGD. We also provide theoretical results on the convergence of this algorithm. In addition, we exploit the stability of the learning rate warmup strategy and the high accuracy of the learning rate decay strategy, and propose a warmup-decay learning rate strategy built from two exponential functions (named 2ExpLR). Experimental results on different datasets indicate that the proposed algorithms improve accuracy significantly and make training faster and more stable.
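To make the abstract's two ideas concrete, the sketch below illustrates one plausible reading of them: an update that blends the SGDM and SGD directions with an exponentially shrinking weight rho**t, and a warmup-decay schedule formed as the product of two exponentials. The specific coefficients, the functions two_exp_lr and tsgd_step, and all hyperparameter values are illustrative assumptions, not the paper's exact formulas.

```python
import numpy as np

def two_exp_lr(t, lr_max=0.1, warmup_tau=100.0, decay_tau=5000.0):
    """Hypothetical 2ExpLR-style schedule: an exponential warmup factor
    times an exponential decay factor (the paper's exact form may differ)."""
    warmup = 1.0 - np.exp(-t / warmup_tau)   # rises smoothly from 0 toward 1
    decay = np.exp(-t / decay_tau)           # falls smoothly from 1 toward 0
    return lr_max * warmup * decay

def tsgd_step(w, grad, v, t, beta=0.9, rho=0.999, lr=0.1):
    """Hypothetical TSGD update: blend the heavy-ball direction and the
    plain gradient with weight rho**t, so early steps behave like SGDM
    and late steps like SGD."""
    v = beta * v + grad                      # heavy-ball momentum buffer
    scale = rho ** t                         # transition weight: 1 -> 0
    direction = scale * v + (1.0 - scale) * grad
    return w - lr * direction, v

# Toy usage: minimize f(w) = ||w||^2 / 2, whose gradient is w itself.
w = np.array([5.0, -3.0])
v = np.zeros_like(w)
for t in range(1, 2001):
    grad = w
    w, v = tsgd_step(w, grad, v, t, lr=two_exp_lr(t))
print(w)  # close to the optimum at the origin
```

Under these assumptions, the optimizer enjoys momentum's acceleration early on, while the exponentially vanishing weight removes the momentum-induced oscillation near convergence, which is the trade-off the abstract describes.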

Keywords