Implicit Bias of Deep Learning in the Large Learning Rate Phase: A Data Separability Perspective

Chunrui Liu; Wei Huang; Richard Yi Da Xu

doi:10.3390/app13063961

Applied Sciences (Mar 2023)

Affiliations

Chunrui Liu: School of Computer Science, Faculty of Engineering and IT, University of Technology Sydney, Ultimo, NSW 2007, Australia
Wei Huang: RIKEN Center for Advanced Intelligence Project (AIP), 1-4-1 Nihonbashi, Chuo-ku, Tokyo 103-0027, Japan
Richard Yi Da Xu: Department of Mathematics, Hong Kong Baptist University, Kowloon Tong, Hong Kong

DOI: https://doi.org/10.3390/app13063961
Journal volume & issue: Vol. 13, no. 6, p. 3961

Abstract

Previous literature on deep learning theory has focused on implicit bias with small learning rates. In this work, we explore the impact of data separability on the implicit bias of deep learning algorithms under a large learning rate. Using deep linear networks for binary classification with the logistic loss in the large learning rate regime, we characterize how data separability shapes the implicit bias of the training dynamics. From a data analytics perspective, we claim that, depending on the separation conditions of the data, the gradient descent iterates converge to a flatter minimum in the large learning rate phase, which results in improved generalization. Our theory is rigorously proven under the assumption of degenerate data by overcoming the difficulty of the non-constant Hessian of the logistic loss, and is confirmed by experiments on both degenerate and non-degenerate datasets. Our results highlight the importance of data separability in training dynamics and the benefits of learning rate annealing schemes that start from a large learning rate.
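
The flatter-minimum claim can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper's experiments: a hypothetical two-parameter linear model f(x) = a*b*x, a toy non-separable dataset {(x=1, y=+1), (x=1, y=-1)}, and the function names `grad_factor`, `loss`, `sharpness`, and `train`. On this toy problem every point with a*b = 0 is a minimum of the logistic loss, and the largest Hessian eigenvalue there is (a^2 + b^2)/4, so gradient descent with step size lr can only settle where that value is at most 2/lr.

```python
import math

# Toy setup (an assumption for illustration, not the paper's experiment):
# model f(x) = a * b * x, data {(x=1, y=+1), (x=1, y=-1)}, logistic loss.
# With p = a * b the average loss is
#   L(p) = 0.5 * [log(1 + exp(-p)) + log(1 + exp(p))],
# which is minimized at p = 0 with value log(2).

def grad_factor(p):
    """dL/dp for the toy loss; simplifies to 0.5 * tanh(p / 2)."""
    return 0.5 * math.tanh(p / 2.0)

def loss(a, b):
    """Average logistic loss over the two toy samples."""
    p = a * b
    return 0.5 * (math.log1p(math.exp(-p)) + math.log1p(math.exp(p)))

def sharpness(a, b):
    """Largest Hessian eigenvalue, valid only at a minimum (a * b = 0)."""
    return 0.25 * (a * a + b * b)

def train(lr, a=3.0, b=0.5, steps=5000):
    """Plain gradient descent on (a, b) with a fixed learning rate."""
    for _ in range(steps):
        g = grad_factor(a * b)
        # Chain rule: dL/da = g * b and dL/db = g * a (simultaneous update).
        a, b = a - lr * g * b, b - lr * g * a
    return a, b

if __name__ == "__main__":
    for lr in (0.01, 1.5):
        a, b = train(lr)
        print(f"lr={lr}: loss={loss(a, b):.6f}, sharpness={sharpness(a, b):.4f}")
```

Both runs start from the same initialization and reach the same loss value, but the run with the larger step size ends at a minimum with smaller sharpness. This only mirrors the qualitative statement in the abstract on a two-parameter toy problem; it says nothing about generalization.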

Published in Applied Sciences

ISSN: 2076-3417 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Technology: Engineering (General). Civil engineering (General); Science: Biology (General); Science: Physics; Science: Chemistry
Website: http://www.mdpi.com/journal/applsci
