IEEE Access (Jan 2024)

SWAG: A Novel Neural Network Architecture Leveraging Polynomial Activation Functions for Enhanced Deep Learning Efficiency

  • Saeid Safaei,
  • Zerotti Woods,
  • Khaled Rasheed,
  • Thiab R. Taha,
  • Vahid Safaei,
  • Juan B. Gutierrez,
  • Hamid R. Arabnia

DOI
https://doi.org/10.1109/ACCESS.2024.3403457
Journal volume & issue
Vol. 12
pp. 73363 – 73375

Abstract


Deep learning techniques have demonstrated significant capabilities across numerous applications, with deep neural networks (DNNs) showing promising results. However, training these networks efficiently, especially when determining the most suitable nonlinear activation functions, remains a significant challenge. While the ReLU activation function has been widely adopted, other hand-designed functions have been proposed; another approach is to use trainable activation functions. This paper introduces a novel neural network design, SWAG. In this structure, activation functions do not evolve during training; instead, they consistently form a fixed polynomial basis. Each hidden layer in this architecture comprises k sub-layers that use polynomial activation functions scaled by a factorial coefficient, followed by a Concatenate layer and a layer employing a linear activation function. Leveraging the Stone-Weierstrass approximation theorem, we demonstrate that utilizing a diverse set of polynomial activation functions allows neural networks to retain universal approximation capabilities. The SWAG architecture is then presented, with an emphasis on data normalization, and a new optimized version of SWAG is proposed that reduces the computational cost of handling higher-degree terms of the input. This optimization draws on the Taylor series method, reusing lower-degree terms to compute higher-degree terms efficiently. This paper thus contributes an innovative neural network architecture that optimizes polynomial activation functions, promising more efficient and robust deep learning applications.
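For concreteness, the sketch below shows one possible reading of a SWAG-style hidden layer as described in the abstract: k parallel sub-layers whose outputs pass through fixed polynomial activations scaled by factorial coefficients (z^i / i!), a concatenation step, and a final layer with a linear activation. The framework (PyTorch), the layer widths, and the exact placement of the polynomial activation are illustrative assumptions, not the authors' reference implementation.

    # Minimal sketch of one SWAG-style hidden layer, based only on the abstract.
    # Widths, k, and the placement of the polynomial activation are assumptions.
    import math
    import torch
    import torch.nn as nn

    class SWAGHiddenLayer(nn.Module):
        def __init__(self, in_features: int, sub_width: int, out_features: int, k: int = 3):
            super().__init__()
            # One affine sub-layer per polynomial degree 1..k.
            self.sub_layers = nn.ModuleList(
                [nn.Linear(in_features, sub_width) for _ in range(k)]
            )
            # Layer with a linear activation applied to the concatenated sub-layer outputs.
            self.combine = nn.Linear(k * sub_width, out_features)

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            parts = []
            for i, layer in enumerate(self.sub_layers, start=1):
                z = layer(x)
                # Fixed polynomial activation with factorial scaling: z^i / i!
                parts.append(z.pow(i) / math.factorial(i))
            return self.combine(torch.cat(parts, dim=-1))

    # Example: map 16 normalized input features to 8 outputs with k = 3 degrees.
    layer = SWAGHiddenLayer(in_features=16, sub_width=32, out_features=8, k=3)
    y = layer(torch.randn(4, 16))
    print(y.shape)  # torch.Size([4, 8])

In terms of this sketch, the optimized SWAG variant described in the abstract would correspond to building the degree-i term incrementally from the degree-(i-1) term, i.e. z^i / i! = (z^(i-1) / (i-1)!) * z / i, rather than recomputing each power from scratch.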

Keywords