On the Universally Optimal Activation Function for a Class of Residual Neural Networks

Feng Zhao; Shao-Lun Huang

doi:10.3390/appliedmath2040033

AppliedMath (Oct 2022)

Affiliations

Feng Zhao: Department of Electronics Engineering, Tsinghua University, Beijing 100089, China
Shao-Lun Huang: Tsinghua-Berkeley Shenzhen Institute, Tsinghua University, Shenzhen 518000, China

DOI: https://doi.org/10.3390/appliedmath2040033
Journal volume & issue: Vol. 2, no. 4, pp. 574–584

Abstract

While non-linear activation functions play a vital role in artificial neural networks, it is generally unclear how non-linearity improves the quality of function approximation. In this paper, we present a theoretical framework to rigorously analyze the performance gain from using non-linear activation functions in a class of residual neural networks (ResNets). In particular, we show that when the input features of the ResNet are uniformly chosen and orthogonal to each other, using non-linear activation functions to generate the ResNet output outperforms using linear activation functions on average, and the performance gain can be computed explicitly. Moreover, we show that when the activation functions are chosen as polynomials of degree much smaller than the dimension of the input features, the optimal activation functions can be precisely expressed in the form of Hermite polynomials. This demonstrates the role of Hermite polynomials in function approximation with ResNets.

Published in AppliedMath

ISSN: 2673-9909 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Science: Mathematics
Website: https://www.mdpi.com/journal/appliedmath
