IEEE Access (Jan 2023)

PolyLU: A Simple and Robust Polynomial-Based Linear Unit Activation Function for Deep Learning

  • Han-Shen Feng
  • Cheng-Hsiung Yang

DOI
https://doi.org/10.1109/ACCESS.2023.3315308
Journal volume & issue
Vol. 11
pp. 101347 – 101358

Abstract

The choice of activation function strongly influences whether a convolutional neural network converges. A suitable activation function not only speeds up convergence but can also allow a simpler network architecture to reach the same or better performance. Many activation functions have been proposed, but each has its own advantages, drawbacks, and network architectures for which it is suited. This paper proposes a new activation function, the Polynomial Linear Unit (PolyLU), to address some of the shortcomings of existing activation functions. PolyLU has the following basic properties: it is continuously differentiable, approximates the identity near the origin, is unbounded for positive inputs, is bounded for negative inputs, and is smooth, monotonic, and zero-centered. PolyLU uses a polynomial term for negative inputs and contains no exponential terms, which reduces the computational complexity of the network. Experiments comparing PolyLU with common activation functions (Sigmoid, Tanh, ReLU, Leaky ReLU, ELU, Mish, and Swish) show that PolyLU reduces network complexity in some cases and achieves better accuracy on the MNIST, Kaggle Cats and Dogs, CIFAR-10, and CIFAR-100 datasets. On CIFAR-100 with batch normalization, PolyLU improves accuracy by 0.62%, 2.82%, 2.44%, 1.33%, 2.08%, and 4.26% over ELU, Swish, Mish, Leaky ReLU, ReLU, and Tanh, respectively. On CIFAR-100 without batch normalization, PolyLU improves accuracy by 1.24%, 4.39%, 2.12%, 5.43%, 15.51%, and 8.10% over ELU, Swish, Mish, Leaky ReLU, ReLU, and Tanh, respectively.
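The abstract lists PolyLU's properties but does not state its formula. As a rough illustration only, the sketch below uses an assumed piecewise form (identity for non-negative inputs, 1/(1 - x) - 1 for negative inputs) that is consistent with the listed properties: it avoids exponentials, has slope 1 at the origin, and is bounded below by -1. The function names `polylu_sketch` and `polylu_sketch_grad` are hypothetical, and the formula should be checked against the paper before use.

```python
def polylu_sketch(x: float) -> float:
    """Assumed PolyLU-like activation; the formula is NOT taken from the abstract."""
    if x >= 0.0:
        # Identity branch: unbounded for positive inputs.
        return x
    # Exponential-free negative branch: equals 0 at x = 0 and
    # approaches -1 as x -> -infinity, so it is bounded below.
    return 1.0 / (1.0 - x) - 1.0


def polylu_sketch_grad(x: float) -> float:
    """Derivative of the assumed form; both branches give 1 at x = 0."""
    if x >= 0.0:
        return 1.0
    # d/dx [1 / (1 - x) - 1] = 1 / (1 - x)^2, which is always positive,
    # so the function is strictly increasing on the negative side.
    return 1.0 / (1.0 - x) ** 2


if __name__ == "__main__":
    for value in (-100.0, -1.0, -0.01, 0.0, 0.01, 2.5):
        print(value, polylu_sketch(value), polylu_sketch_grad(value))
```

Under this assumed form, `polylu_sketch(-1.0)` evaluates to -0.5, and the gradient shrinks toward 0 for large negative inputs while staying positive, which matches the "bounded for negative inputs" and "monotonic" properties described above.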

Keywords