Mathematics Interdisciplinary Research (Sep 2024)

Modified Step Size for Enhanced Stochastic Gradient Descent: Convergence and Experiments

  • Mahsa Soheil Shamaee,
  • Sajad Fathi Hafshejani

DOI
https://doi.org/10.22052/mir.2023.253279.1426
Journal volume & issue
Vol. 9, no. 3
pp. 237 – 253

Abstract


This paper introduces a novel approach to enhance the performance of the stochastic gradient descent (SGD) algorithm by incorporating a modified decay step size based on $\frac{1}{\sqrt{t}}$. The proposed step size integrates a logarithmic term, which selects smaller values in the final iterations. Our analysis establishes a convergence rate of $O\left(\frac{\ln T}{\sqrt{T}}\right)$ for smooth non-convex functions without the Polyak-Łojasiewicz condition. To evaluate the effectiveness of our approach, we conducted numerical experiments on image classification tasks using the Fashion-MNIST and CIFAR10 datasets; the results show accuracy improvements of $0.5\%$ and $1.4\%$, respectively, over the traditional $\frac{1}{\sqrt{t}}$ step size. The source code can be found at https://github.com/Shamaeem/LNSQRTStepSize.
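The abstract does not state the exact functional form of the step size, so the following is a minimal sketch only: it assumes an illustrative log-modified decay $\eta_t = \frac{\eta_0}{\sqrt{t}\,(1+\ln t)}$, which is smaller than the plain $\frac{\eta_0}{\sqrt{t}}$ baseline in late iterations, and a hypothetical smooth non-convex toy objective with noisy gradients standing in for the paper's image-classification losses.

```python
import numpy as np

def grad_estimate(w, rng):
    # Stochastic gradient of a smooth non-convex toy objective
    # f(w) = sum(w_i^2 / (1 + w_i^2)), with additive Gaussian noise
    # playing the role of mini-batch sampling noise.
    g = 2.0 * w / (1.0 + w**2) ** 2
    return g + 0.1 * rng.standard_normal(w.shape)

def modified_step_size(t, eta0=0.5):
    # Illustrative log-modified 1/sqrt(t) decay (assumed form, not
    # necessarily the paper's): the extra (1 + ln t) factor in the
    # denominator yields smaller steps in the final iterations than
    # the plain eta0 / sqrt(t) schedule.
    return eta0 / (np.sqrt(t) * (1.0 + np.log(t)))

rng = np.random.default_rng(0)
w = rng.standard_normal(10)
for t in range(1, 1001):          # iteration count t starts at 1
    w -= modified_step_size(t) * grad_estimate(w, rng)
print("final iterate norm:", np.linalg.norm(w))
```

In practice the same schedule could be attached to an off-the-shelf SGD optimizer (e.g., via a per-iteration learning-rate callback); the key design point is that only the step-size rule changes, not the SGD update itself.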
