Компьютерная оптика (Apr 2020)

Reducing computational costs in deep learning on almost linearly separable training data

  • Ilona Kulikovskikh

DOI
https://doi.org/10.18287/2412-6179-CO-645
Journal volume & issue
Vol. 44, no. 2
pp. 282 – 289

Abstract


Previous research in deep learning indicates that the iterates of gradient descent on separable data converge in direction toward the L2 maximum-margin solution. Even in the absence of explicit regularization, the decision boundary continues to change after the training classification error has reached zero. This so-called "implicit regularization" allows gradient methods to use more aggressive learning rates, which results in substantial computational savings. However, although gradient descent generalizes well as it moves toward the optimal solution, its rate of convergence to this solution is much slower than the rate of convergence of the loss function itself with a fixed step size. The present study puts forward a generalized logistic loss function that involves the optimization of hyperparameters, which yields a faster convergence rate while keeping the same regret bound as gradient descent. The results of computational experiments on the MNIST and Fashion-MNIST image classification benchmarks confirm the viability of the proposed approach to reducing computational costs and outline directions for future research.
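The abstract does not spell out the generalized logistic loss itself, so the sketch below only illustrates the setting it describes: plain gradient descent with a fixed step size and a logistic-type loss on almost linearly separable data, where the decision boundary keeps moving after the training error has stopped improving. The shape parameter `nu`, the toy data, and all other names are illustrative assumptions, not the paper's formulation; `nu = 1` recovers the ordinary logistic loss.

```python
# Minimal NumPy sketch of gradient descent with a logistic-type loss on
# almost linearly separable data. The parameter `nu` is a hypothetical
# placeholder for the hyperparameters of the paper's generalized logistic
# loss (not reproduced here); nu = 1 gives the standard logistic loss.
import numpy as np

rng = np.random.default_rng(0)

# Two overlapping Gaussian blobs: almost linearly separable, labels in {-1, +1}.
n = 200
X = np.vstack([rng.normal(+1.5, 1.0, (n, 2)),
               rng.normal(-1.5, 1.0, (n, 2))])
y = np.hstack([np.ones(n), -np.ones(n)])

def loss_and_grad(w, X, y, nu=1.0):
    """Loss (1/nu) * log(1 + nu * exp(-margin)), averaged over the sample."""
    m = y * (X @ w)                                   # signed margins
    loss = np.mean(np.log1p(nu * np.exp(-m))) / nu
    p = np.exp(-m) / (1.0 + nu * np.exp(-m))          # per-sample weight
    grad = -(X * (y * p)[:, None]).mean(axis=0)
    return loss, grad

w = np.zeros(2)
lr = 0.5                                              # fixed step size
for t in range(1, 5001):
    loss, grad = loss_and_grad(w, X, y, nu=1.0)
    w -= lr * grad
    if t in (100, 1000, 5000):
        err = np.mean(np.sign(X @ w) != y)
        # The training error typically stops improving long before the loss
        # and the direction w / ||w|| settle: the implicit-regularization
        # effect the abstract refers to.
        print(f"step {t:5d}  loss {loss:.4f}  train error {err:.3f}  "
              f"||w|| {np.linalg.norm(w):.2f}")
```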

Keywords