IEEE Access (Jan 2024)
SWAG: A Novel Neural Network Architecture Leveraging Polynomial Activation Functions for Enhanced Deep Learning Efficiency
Abstract
Deep learning techniques have demonstrated significant capabilities across numerous applications, with deep neural networks (DNNs) showing promising results. However, training these networks efficiently, especially when determining the most suitable nonlinear activation functions, remains a significant challenge. While the ReLU activation function has been widely adopted, other hand-designed functions have been proposed, as have trainable activation functions that adapt during training. This paper introduces a novel neural network design, SWAG. In this architecture, activation functions do not evolve during training; instead, they form a fixed polynomial basis. Each hidden layer comprises k sub-layers that apply polynomial activation functions scaled by factorial coefficients, followed by a Concatenate layer and a layer with a linear activation function. Leveraging the Stone-Weierstrass approximation theorem, we demonstrate that using a diverse set of polynomial activation functions allows neural networks to retain universal approximation capabilities. The SWAG algorithm's architecture is then presented, with emphasis on data normalization, and a new optimized version of SWAG is proposed that reduces the computational cost of handling higher-degree input terms. This optimization exploits the Taylor-series structure by reusing lower-degree terms to compute higher-degree terms efficiently. This paper thus contributes an innovative neural network architecture built on polynomial activation functions, promising more efficient and robust deep learning applications.
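To make the layer structure described above concrete, the following is a minimal sketch of one SWAG hidden block, assuming a Keras-style implementation; the function name make_swag_block, the choices units=32 and k=4, and the toy input shape are illustrative assumptions, not details taken from the paper.

```python
# Minimal sketch of one SWAG hidden block (illustrative, not the authors' code).
import math
import tensorflow as tf
from tensorflow.keras import layers

def make_swag_block(x, units=32, k=4):
    """One SWAG hidden layer: k parallel sub-layers whose activations are the
    polynomial terms t**n / n! (Taylor-style factorial scaling), merged by a
    Concatenate layer and projected through a linear-activation layer."""
    branches = []
    for n in range(1, k + 1):
        # Bind n immediately so each branch keeps its own polynomial degree.
        act = (lambda n: (lambda t: t ** n / math.factorial(n)))(n)
        branches.append(layers.Dense(units, activation=act)(x))
    merged = layers.Concatenate()(branches)
    return layers.Dense(units, activation="linear")(merged)

# Toy usage: a single SWAG block feeding a scalar output head.
inputs = tf.keras.Input(shape=(16,))
outputs = layers.Dense(1)(make_swag_block(inputs))
model = tf.keras.Model(inputs, outputs)
```

The factorial scaling mirrors the Taylor-series coefficients mentioned in the abstract, keeping higher-degree branches numerically small before the linear layer recombines them.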
Keywords