Bulletin of the Polish Academy of Sciences: Technical Sciences (Oct 2023)
A hybrid model of heuristic algorithm and gradient descent to optimize neural networks
Abstract
Training a neural network can be challenging, particularly with complex models and large amounts of training data, because it consumes significant time and resources. This research proposes a hybrid model that combines population-based heuristic algorithms with traditional gradient-based techniques to enhance the training process. Instead of starting from random weights, the proposed approach uses a dynamic population-based heuristic algorithm to identify good initial values for the neural network weight vector. After several cycles of distributing search agents across the search domain, training continues with a gradient-based technique that starts from the best weight vector identified by the heuristic algorithm. Experimental analysis confirms that exploring the search domain during the training process decreases the number of cycles needed for gradient descent to train a neural network. Furthermore, a dynamic population strategy is applied during the heuristic search, with search agents added and removed dynamically based on their progress. This approach yields better results than traditional heuristic algorithms that use the same population members throughout the search process.
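The two-phase scheme described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the heuristic, so a simple perturbation search stands in for it, and the toy single-neuron model, the function names (`heuristic_search`, `gradient_descent`), and all parameter values (`patience`, `step`, `bound`, `lr`) are assumptions made only for illustration.

```python
import math
import random

# Toy data for a single tanh neuron (an assumed stand-in for a real network).
DATA = [(x / 10.0, math.tanh(2.0 * x / 10.0 - 1.0)) for x in range(-10, 11)]

def loss(w):
    """Mean squared error of the neuron tanh(w[0] * x + w[1]) on DATA."""
    return sum((math.tanh(w[0] * x + w[1]) - y) ** 2 for x, y in DATA) / len(DATA)

def grad(w):
    """Analytic gradient of the loss with respect to w[0] and w[1]."""
    ga = gb = 0.0
    for x, y in DATA:
        out = math.tanh(w[0] * x + w[1])
        common = 2.0 * (out - y) * (1.0 - out * out) / len(DATA)
        ga += common * x
        gb += common
    return [ga, gb]

def heuristic_search(rng, cycles=20, pop_size=8, patience=3, step=0.5, bound=5.0):
    """Phase 1 (assumed heuristic): perturbation search with a dynamic population."""
    def new_agent():
        w = [rng.uniform(-bound, bound) for _ in range(2)]
        return {"w": w, "loss": loss(w), "stall": 0}

    population = [new_agent() for _ in range(pop_size)]
    best = min(population, key=lambda a: a["loss"])
    best_w, best_loss = list(best["w"]), best["loss"]
    for _ in range(cycles):
        for agent in population:
            candidate = [wi + rng.gauss(0.0, step) for wi in agent["w"]]
            cand_loss = loss(candidate)
            if cand_loss < agent["loss"]:
                agent["w"], agent["loss"], agent["stall"] = candidate, cand_loss, 0
            else:
                agent["stall"] += 1
        # Dynamic population: remove agents that stopped progressing and
        # add fresh random agents in their place.
        population = [a if a["stall"] < patience else new_agent() for a in population]
        for agent in population:
            if agent["loss"] < best_loss:
                best_w, best_loss = list(agent["w"]), agent["loss"]
    return best_w, best_loss

def gradient_descent(w, lr=0.1, steps=200):
    """Phase 2: plain gradient descent; returns the best weights seen."""
    best_w, best_loss = list(w), loss(w)
    w = list(w)
    for _ in range(steps):
        g = grad(w)
        w = [wi - lr * gi for wi, gi in zip(w, g)]
        current = loss(w)
        if current < best_loss:
            best_w, best_loss = list(w), current
    return best_w, best_loss

rng = random.Random(0)                            # fixed seed for repeatability
w0, loss0 = heuristic_search(rng)                 # heuristic picks the start point
w_final, loss_final = gradient_descent(w0)        # gradient descent refines it
```

Because `gradient_descent` tracks the best weights it has seen, including its starting point, `loss_final` is never worse than `loss0`. Whether the heuristic warm start actually reduces the number of gradient descent cycles on a given problem is the empirical question the paper addresses; this toy example does not demonstrate it.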
Keywords