A Large-Scale Study of Activation Functions in Modern Deep Neural Network Architectures for Efficient Convergence

Andrinandrasana David Rasamoelina; Ivan Cík; Peter Sincak; Marián Mach; Lukáš Hruška

doi:10.4114/intartif.vol25iss70pp95-109

Inteligencia Artificial (Dec 2022)

Affiliations

Andrinandrasana David Rasamoelina: Dept. of Cybernetics and Artificial Intelligence, FEI TU of Kosice, Slovak Republic
Ivan Cík: Dept. of Cybernetics and Artificial Intelligence, FEI TU of Kosice, Slovak Republic
Peter Sincak: Faculty of Mechanical Engineering and Informatics, University of Miskolc, Hungary
Marián Mach: Dept. of Cybernetics and Artificial Intelligence, FEI TU of Kosice, Slovak Republic
Lukáš Hruška: Dept. of Cybernetics and Artificial Intelligence, FEI TU of Kosice, Slovak Republic

DOI: https://doi.org/10.4114/intartif.vol25iss70pp95-109
Journal volume & issue: Vol. 25, no. 70

Abstract

Activation functions play an important role in the convergence of learning algorithms based on neural networks. They provide neural networks with nonlinear capability and the ability to fit complex data. However, no in-depth study exists in the literature on the behavior of activation functions in modern architectures. Therefore, in this research, we compare the 18 most used activation functions on multiple datasets (CIFAR-10, CIFAR-100, CALTECH-256) using 4 different models (EfficientNet, ResNet, a variation of ResNet using the bag of tricks, and MobileNet V3). Furthermore, we explore the shape of the loss landscape of those different architectures with various activation functions. Lastly, based on the results of our experiments, we introduce a new locally quadratic activation function, Hytana, alongside one variation, Parametric Hytana, which outperforms common activation functions and addresses the dying ReLU problem.

Published in Inteligencia Artificial

ISSN: 1137-3601 (Print); 1988-3064 (Online)
Publisher: Asociación Española para la Inteligencia Artificial
Country of publisher: Spain
LCC subjects: Science: Mathematics: Instruments and machines: Electronic computers. Computer science
Website: http://journal.iberamia.org
