Algorithms (Jun 2024)
Training of Convolutional Neural Networks for Image Classification with Fully Decoupled Extended Kalman Filter
Abstract
First-order algorithms have long dominated the training of deep neural networks, excelling in tasks such as image classification and natural language processing. There is now a compelling opportunity to explore alternatives that could outperform current state-of-the-art results. From estimation theory, the Extended Kalman Filter (EKF) arose as a viable alternative and has shown advantages over backpropagation methods. Current computational advances make it possible to revisit algorithms derived from the EKF, which have been almost entirely absent from the training of convolutional neural networks. This article revisits a decoupled variant of the EKF, the Fully Decoupled Extended Kalman Filter (FDEKF), and applies it to the training of convolutional neural networks for image classification tasks. The FDEKF is a second-order algorithm, which gives it some advantages over first-order algorithms: it can lead to faster convergence and higher accuracy, owing to a higher probability of finding the global optimum. Experiments are conducted on well-known datasets of fashion, sports, and handwritten digit images. The FDEKF shows faster convergence than other algorithms such as the popular Adam optimizer, the sKAdam algorithm, and the reduced extended Kalman filter. Finally, motivated by the finding that the FDEKF achieved the highest accuracy on images of natural scenes, we show its effectiveness in a further experiment focused on outdoor terrain recognition.
Keywords