ORiON (Jun 2009)

Conjugate descent formulation of backpropagation error in feedforward neural networks

  • NK Sharma,
  • S Kumar,
  • MP Singh

DOI: https://doi.org/10.5784/25-1-73
Journal volume & issue: Vol. 25, no. 1

Abstract

The feedforward neural network architecture uses backpropagation learning to determine optimal weights between the different interconnected layers. This learning procedure applies gradient descent to a sum-of-squares error function for the given input-output patterns, minimising the error iteratively by adjusting the weights of the network. The first derivatives of the error with respect to the weights determine the descent direction on the local error surface. The network therefore exhibits a different local error surface for each pattern presented to it, and the weights are iteratively modified to minimise the current local error. An optimal weight vector can be determined only when the total error (the mean of the minimum local errors) over all patterns in the training set is minimised. In this paper, we present a general mathematical formulation for the second derivative of the error function with respect to the weights (which represents a conjugate descent) for arbitrary feedforward neural network topologies, and we use this derivative information to obtain the optimal weight vector. The local error is backpropagated among the units of the hidden layers via the second-order derivative of the error with respect to the weights of the hidden and output layers, both independently and in combination. The new total minimum error point may be evaluated from the current total minimum error and the currently minimised local error. The weight modification process is performed twice: once with respect to the present local error and once more with respect to the current total (mean) error. We present numerical evidence that the proposed method yields better network weights than those determined via a conventional gradient descent approach.
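
For reference, the quantities the abstract invokes admit a standard formulation (the notation below is assumed for illustration and is not taken from the paper): with network outputs y_k and targets t_k for a given pattern, the sum-of-squares error, its first derivative (the gradient descent direction), and its second derivative (the Hessian elements underlying a second-order, conjugate-type descent) are

E = \frac{1}{2} \sum_k (y_k - t_k)^2

\frac{\partial E}{\partial w_{ij}} = \sum_k (y_k - t_k) \, \frac{\partial y_k}{\partial w_{ij}}

\frac{\partial^2 E}{\partial w_{ij} \, \partial w_{lm}} = \sum_k \left[ \frac{\partial y_k}{\partial w_{ij}} \frac{\partial y_k}{\partial w_{lm}} + (y_k - t_k) \, \frac{\partial^2 y_k}{\partial w_{ij} \, \partial w_{lm}} \right]

Plain gradient descent updates each weight as \Delta w_{ij} = -\eta \, \partial E / \partial w_{ij}; a second-order scheme of the kind outlined in the abstract additionally exploits the second derivative, e.g. in a Newton-type step \Delta w_{ij} = -\left( \partial^2 E / \partial w_{ij}^2 \right)^{-1} \partial E / \partial w_{ij}. The exact update rule derived in the paper may differ.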