IET Control Theory & Applications (Oct 2023)
PID controller‐based adaptive gradient optimizer for deep neural networks
Abstract
Due to improper selection of the gradient update direction or the learning rate, SGD-type optimization algorithms for deep learning suffer from oscillation and slow convergence. Although the Adam algorithm can adaptively adjust the update direction and the learning rate at the same time, it still exhibits the overshoot phenomenon, which wastes computing resources and slows convergence. In this work, the PID controller from the feedback control field is borrowed to re-express adaptive optimization algorithms for deep learning: the Adam optimization algorithm is shown to correspond to the integral (I) component. To alleviate the overshoot phenomenon and thereby speed up the convergence of Adam, a complete adaptive PID optimizer (adaptive-PID) is proposed by incorporating the proportional (P) and derivative (D) components. Extensive experiments on standard data sets verify that the proposed adaptive-PID algorithm significantly outperforms the Adam algorithm in terms of convergence rate and accuracy.
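To make the P/I/D analogy concrete, the following is a minimal sketch of a PID-style optimizer step of the general kind the abstract describes, assuming Adam's bias-corrected moment estimates play the role of the integral (I) term while the raw gradient and the gradient difference supply the P and D terms. The names and gains (adaptive_pid_step, kp, kd) are illustrative placeholders, not the paper's notation or exact formulation.

```python
import numpy as np

def init_state(w):
    """Hypothetical optimizer state: Adam moments plus the previous gradient."""
    return {'m': np.zeros_like(w), 'v': np.zeros_like(w),
            'prev_grad': np.zeros_like(w), 't': 0}

def adaptive_pid_step(w, grad, state, lr=1e-3, beta1=0.9, beta2=0.999,
                      kp=0.1, kd=0.1, eps=1e-8):
    """One PID-style parameter update (illustrative sketch)."""
    state['t'] += 1
    t = state['t']
    # I term: Adam's exponential moving averages with bias correction.
    state['m'] = beta1 * state['m'] + (1 - beta1) * grad
    state['v'] = beta2 * state['v'] + (1 - beta2) * grad ** 2
    m_hat = state['m'] / (1 - beta1 ** t)
    v_hat = state['v'] / (1 - beta2 ** t)
    i_term = m_hat / (np.sqrt(v_hat) + eps)
    # P term: react to the current gradient directly.
    p_term = grad
    # D term: damp oscillation by penalising rapid gradient change.
    d_term = grad - state['prev_grad']
    state['prev_grad'] = grad.copy()
    return w - lr * (i_term + kp * p_term + kd * d_term)

# Toy usage: minimise f(w) = ||w||^2, whose gradient is 2w.
w = np.array([3.0, -2.0])
state = init_state(w)
for _ in range(500):
    w = adaptive_pid_step(w, 2 * w, state, lr=0.1)
print(w)  # driven close to the minimiser [0, 0]
```

In this reading, the D term penalises rapid change in the gradient, which is the control-theoretic mechanism for suppressing overshoot that the abstract attributes to the full adaptive-PID design.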
Keywords