Journal of Taibah University for Science (Dec 2024)
The Hessian by blocks for neural networks by backward propagation
Abstract
The back-propagation algorithm combined with stochastic gradient descent, together with the increase in computing power, underlies the recent deep learning trend. For some problems, however, gradient methods still converge very slowly. Newton's method offers potentially faster convergence: it uses the Hessian matrix to guide the optimization process, but at a higher computational cost per iteration. Indeed, although the expression of the Hessian matrix is explicitly known, previous work has not proposed an efficient algorithm for its fast computation. In this work, we first propose a backward algorithm that computes the exact Hessian matrix. In addition, we introduce original operators for the calculation of second derivatives, which improve readability and allow the backward algorithm to be parallelized. To study the practical performance of Newton's method, we apply the proposed algorithm to train two classical neural networks on regression and classification problems and report the associated numerical results.
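As a point of reference for the trade-off described above, the following minimal sketch shows a damped Newton step on a small least-squares problem, with the exact Hessian obtained by automatic differentiation. This is only an illustration of Newton's method with an exact Hessian, not the block-wise backward algorithm proposed in this paper; the model, data, and damping parameter are hypothetical.

```python
# Illustrative sketch only: Newton's method with an exact Hessian computed by
# automatic differentiation (PyTorch), NOT the paper's block-wise backward
# algorithm. The model and data below are made up for demonstration.
import torch
from torch.autograd.functional import jacobian, hessian

torch.manual_seed(0)
X = torch.randn(64, 3)                                   # 64 samples, 3 features
y = X @ torch.tensor([1.0, -2.0, 0.5]) + 0.1 * torch.randn(64)

def loss(w):
    # Mean squared error of a linear model parameterized by the flat vector w.
    return ((X @ w - y) ** 2).mean()

w = torch.zeros(3)
for _ in range(5):
    g = jacobian(loss, w)                                # gradient, shape (3,)
    H = hessian(loss, w)                                 # exact Hessian, shape (3, 3)
    # Damped Newton step: solve (H + lambda * I) d = -g for numerical stability.
    d = torch.linalg.solve(H + 1e-6 * torch.eye(3), -g)
    w = w + d
print(w)
```

Each iteration requires forming and factorizing the Hessian, which is what makes an efficient (here block-wise, backward) computation of the exact Hessian the central issue addressed in this work.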
Keywords