Machine Learning in Quasi-Newton Methods

Vladimir Krutikov; Elena Tovbis; Predrag Stanimirović; Lev Kazakovtsev; Darjan Karabašević

doi:10.3390/axioms13040240

Axioms (Apr 2024)

Affiliations

Vladimir Krutikov: Laboratory “Hybrid Methods of Modeling and Optimization in Complex Systems”, Siberian Federal University, 79 Svobodny Prospekt, 660041 Krasnoyarsk, Russia
Elena Tovbis: Institute of Informatics and Telecommunications, Reshetnev Siberian State University of Science and Technology, 31, Krasnoyarskii Rabochii Prospekt, 660037 Krasnoyarsk, Russia
Predrag Stanimirović: Laboratory “Hybrid Methods of Modeling and Optimization in Complex Systems”, Siberian Federal University, 79 Svobodny Prospekt, 660041 Krasnoyarsk, Russia
Lev Kazakovtsev: Laboratory “Hybrid Methods of Modeling and Optimization in Complex Systems”, Siberian Federal University, 79 Svobodny Prospekt, 660041 Krasnoyarsk, Russia
Darjan Karabašević: College of Global Business, Korea University, Sejong 30019, Republic of Korea

DOI: https://doi.org/10.3390/axioms13040240
Journal volume & issue: Vol. 13, no. 4, p. 240

Abstract

In this article, we consider the correction of metric matrices in quasi-Newton methods (QNM) from the perspective of machine learning theory. Using training information for estimating the matrix of second derivatives of a function, we formulate a quality functional and minimize it with gradient machine learning algorithms. We demonstrate that this approach leads to the well-known updates of metric matrices used in QNM. The learning algorithm for finding metric matrices performs minimization along a system of directions, whose orthogonality determines the convergence rate of the learning process. The degree of orthogonality of the learning vectors can be increased both by the choice of QNM and by additional orthogonalization methods. It is shown theoretically that the degree of orthogonality of the learning vectors in the Broyden–Fletcher–Goldfarb–Shanno (BFGS) method is higher than in the Davidon–Fletcher–Powell (DFP) method, which accounts for the advantage of the BFGS method. We discuss several orthogonalization techniques. One of them is to include iterations with orthogonalization or an exact one-dimensional descent. As a result, it is theoretically possible to detect the cumulative effect of reducing the optimization space on quadratic functions. Another way to increase the degree of orthogonality of the learning vectors at the initial stages of the QNM is a special choice of initial metric matrices. Our computational experiments on ill-conditioned problems confirm the stated theoretical assumptions.
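
For readers unfamiliar with the two updates named in the abstract, the following is a minimal sketch of the textbook DFP and BFGS formulas, not the authors' learning-based derivation. It assumes that the "metric matrix" H is a symmetric approximation of the inverse Hessian, that s is the step between iterates, and that y is the corresponding gradient difference. The function names and the random test problem are hypothetical. The sketch only checks that both updates satisfy the quasi-Newton (secant) condition H_new y = s.

```python
import numpy as np


def dfp_update(H, s, y):
    """Textbook DFP update of a symmetric inverse-Hessian approximation H."""
    Hy = H @ y
    # For symmetric H, outer(Hy, Hy) equals H y y^T H.
    return H - np.outer(Hy, Hy) / (y @ Hy) + np.outer(s, s) / (y @ s)


def bfgs_update(H, s, y):
    """Textbook BFGS update of a symmetric inverse-Hessian approximation H."""
    rho = 1.0 / (y @ s)
    V = np.eye(len(s)) - rho * np.outer(s, y)
    return V @ H @ V.T + rho * np.outer(s, s)


# Hypothetical test problem: a quadratic with symmetric positive definite
# Hessian A, for which the gradient difference is exactly y = A s.
rng = np.random.default_rng(0)
n = 5
M = rng.standard_normal((n, n))
A = M @ M.T + n * np.eye(n)
s = rng.standard_normal(n)
y = A @ s
H0 = np.eye(n)  # initial metric matrix (an assumption of this sketch)

for name, update in (("DFP", dfp_update), ("BFGS", bfgs_update)):
    H1 = update(H0, s, y)
    # Secant condition: the updated matrix maps y back to s.
    print(name, "satisfies H_new y = s:", np.allclose(H1 @ y, s))
```

Both formulas reproduce the single training pair (s, y) exactly. According to the abstract, the methods differ in how orthogonal the successive learning vectors are, which this sketch does not measure.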

Published in Axioms

ISSN: 2075-1680 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Science: Mathematics
Website: http://www.mdpi.com/journal/axioms
