Journal of King Saud University: Computer and Information Sciences (Jun 2022)

Towards activation function search for long short-term model network: A differential evolution based approach

  • Vijayaprabakaran K.,
  • Sathiyamurthy K.

Journal volume & issue
Vol. 34, no. 6
pp. 2637–2650

Abstract


In Deep Neural Networks (DNNs), several architectures have been proposed for complex tasks such as machine translation, natural language processing and time-series forecasting. Long Short-Term Memory (LSTM), a deep neural network, became the popular architecture for solving sequential and time-series problems and achieved remarkable results. When building an LSTM model, many hyper-parameters such as the activation function, loss function, and optimizer need to be set in advance. These hyper-parameters play a significant role in the performance of DNNs. This work concentrates on finding a novel activation function that can replace the existing activation functions, such as sigmoid and tanh, in the LSTM. A Differential Evolution Algorithm (DEA) based search methodology is proposed in our work to discover a novel activation function for the LSTM network. Our proposed methodology finds an optimal activation function that outperforms traditional activation functions such as the sigmoid (σ), hyperbolic tangent (tanh) and Rectified Linear Unit (ReLU). In this work, the newly discovered activation function based on the DEA methodology is sinh(x) + sinh⁻¹(x), named the Combined Hyperbolic Sine (comb-H-sine) function. The proposed comb-H-sine activation function outperforms the traditional functions in LSTM with accuracies of 98.83%, 93.49% and 78.38% on the MNIST, IMDB and UCI HAR datasets, respectively.
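The comb-H-sine function stated in the abstract, sinh(x) + sinh⁻¹(x), can be sketched directly, reading sinh⁻¹ as the inverse hyperbolic sine (arcsinh). This is a minimal NumPy illustration of the formula only, not the authors' reference implementation:

```python
import numpy as np

def comb_h_sine(x):
    """Combined Hyperbolic Sine activation from the abstract:
    comb-H-sine(x) = sinh(x) + sinh^-1(x).

    Both terms are odd and monotonically increasing, so the
    combined function is odd, monotonic and, unlike sigmoid or
    tanh, unbounded in both directions.
    """
    return np.sinh(x) + np.arcsinh(x)

# Sample the function on a small grid to see its shape.
x = np.linspace(-2.0, 2.0, 5)
print(comb_h_sine(x))
```

In an LSTM this would replace the tanh (and, per the search space described, possibly the sigmoid) gate non-linearities; the unboundedness is one plausible reason it behaves differently from the saturating defaults.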

Keywords