Mathematics (Sep 2023)

Generalization of Neural Networks on Second-Order Hypercomplex Numbers

  • Stanislav Pavlov,
  • Dmitry Kozlov,
  • Mikhail Bakulin,
  • Aleksandr Zuev,
  • Andrey Latyshev,
  • Alexander Beliaev

DOI
https://doi.org/10.3390/math11183973
Journal volume & issue
Vol. 11, no. 18
p. 3973

Abstract


The vast majority of existing neural networks operate by rules set within the algebra of real numbers. However, as theoretical understanding of the fundamentals of neural networks deepens and their practical applications grow, new problems arise that require going beyond this algebra. Various tasks involve original data that naturally have complex-valued formats. This encourages researchers to explore whether neural networks based on complex numbers can provide benefits over ones limited to real numbers. Multiple recent works have been dedicated to developing the architecture and building blocks of complex-valued neural networks. In this paper, we generalize such models by considering other types of second-order hypercomplex numbers: dual and double numbers. We developed basic operators for these algebras, such as convolution, activation functions, and batch normalization, and rebuilt several real-valued networks to operate over these new algebras. We also developed a general methodology for dual- and double-valued gradient calculations based on the Wirtinger derivatives used for complex-valued functions. On classical computer vision (CIFAR-10, CIFAR-100, SVHN) and signal processing (G2Net, MusicNet) classification problems, our benchmarks show that the transition to the hypercomplex domain can help reach higher metric values than the original real-valued models achieve.
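The three second-order hypercomplex algebras mentioned in the abstract differ only in the square of the imaginary unit u: u² = −1 gives complex numbers, u² = 0 gives dual numbers, and u² = +1 gives double (split-complex) numbers. The following minimal sketch is not the authors' implementation; the function names hc_multiply and hc_linear are illustrative only. It shows how a single parameter mu selects the algebra and how the same multiplication rule would drive a hypercomplex linear layer on real-valued part tensors.

```python
# Minimal sketch (not the paper's code): multiplication in the three
# second-order hypercomplex algebras. A number is written a + b*u, where
# the unit u satisfies u^2 = mu:
#   mu = -1  -> complex numbers
#   mu =  0  -> dual numbers
#   mu = +1  -> double (split-complex) numbers
import numpy as np


def hc_multiply(a, b, c, d, mu):
    """(a + b*u) * (c + d*u) = (a*c + mu*b*d) + (a*d + b*c)*u.

    a, b, c, d may be scalars or NumPy arrays of matching shape, so the same
    rule applies elementwise to the two parts of a feature map.
    """
    real = a * c + mu * b * d
    imag = a * d + b * c
    return real, imag


def hc_linear(x_re, x_im, w_re, w_im, mu):
    """Hypothetical fully connected layer over a second-order algebra.

    x_re, x_im : (batch, in_features) parts of the input
    w_re, w_im : (in_features, out_features) parts of the weight matrix
    The matrix product follows the same rule as hc_multiply.
    """
    out_re = x_re @ w_re + mu * (x_im @ w_im)
    out_im = x_re @ w_im + x_im @ w_re
    return out_re, out_im


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x_re, x_im = rng.normal(size=(4, 8)), rng.normal(size=(4, 8))
    w_re, w_im = rng.normal(size=(8, 3)), rng.normal(size=(8, 3))
    for name, mu in [("complex", -1.0), ("dual", 0.0), ("double", 1.0)]:
        y_re, y_im = hc_linear(x_re, x_im, w_re, w_im, mu)
        print(name, y_re.shape, y_im.shape)
```

A convolution over any of these algebras follows the same pattern: the real and imaginary parts of the kernel are combined according to the chosen mu, which is why a single parameterized rule can cover complex-, dual-, and double-valued variants of a network.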

Keywords