Visual Intelligence (Dec 2024)
Unified regularity measures for sample-wise learning and generalization
Abstract
Fundamental machine learning theory shows that different samples contribute unequally to both the learning and testing processes. Recent studies on deep neural networks (DNNs) suggest that such sample differences are rooted in the distribution of intrinsic pattern information, namely sample regularity. Motivated by recent discoveries in network memorization and generalization, we propose a pair of sample regularity measures with a formulation-consistent representation for both processes. Specifically, the cumulative binary training/generalizing loss (CBTL/CBGL), i.e., the cumulative number of correct classifications of a training/test sample within the training phase, is proposed to quantify the stability of the memorization-generalization process, while forgetting/mal-generalizing events (ForEvents/MgEvents), i.e., misclassifications of previously learned or generalized samples, are used to represent the uncertainty of sample regularity with respect to optimization dynamics. The effectiveness and robustness of the proposed measures under mini-batch stochastic gradient descent (SGD) optimization are validated through sample-wise analyses. Further applications to training/test sample selection show that the proposed measures, which share a unified computing procedure, can benefit both tasks.
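To make the unified computing procedure concrete, the following is a minimal sketch (not the authors' released code) of how CBTL/CBGL and ForEvents/MgEvents could be accumulated from per-epoch binary correctness records; the function name, input format, and example data are illustrative assumptions.

```python
import numpy as np

def regularity_measures(correct):
    """Compute sample-wise regularity measures from per-epoch correctness.

    Args:
        correct: bool array of shape (num_epochs, num_samples), where
            correct[t, i] is True if sample i is classified correctly at
            the end of epoch t. Applied to training samples this yields
            CBTL/ForEvents; applied to test samples, CBGL/MgEvents
            (assumed input format, for illustration only).

    Returns:
        cbl: per-sample cumulative number of correct classifications
            across epochs (CBTL or CBGL).
        events: per-sample count of forgetting/mal-generalizing events,
            i.e., transitions from correct at epoch t to incorrect at
            epoch t + 1.
    """
    correct = np.asarray(correct, dtype=bool)
    cbl = correct.sum(axis=0)
    # An event occurs whenever a previously learned/generalized sample
    # (correct at epoch t) is misclassified at epoch t + 1.
    events = (correct[:-1] & ~correct[1:]).sum(axis=0)
    return cbl, events

# Hypothetical usage: 5 epochs, 3 samples.
history = np.array([
    [1, 0, 1],
    [1, 1, 0],
    [0, 1, 1],
    [1, 1, 1],
    [1, 1, 1],
], dtype=bool)
cbtl, for_events = regularity_measures(history)
print(cbtl)        # [4 4 4]
print(for_events)  # [1 0 1]
```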
Keywords