Modular Dynamic Neural Network: A Continual Learning Architecture

Daniel Turner; Pedro J. S. Cardoso; João M. F. Rodrigues

doi:10.3390/app112412078

Applied Sciences (Dec 2021)

Affiliations

Daniel Turner: LARSYS & ISE, Universidade do Algarve, 8005-139 Faro, Portugal
Pedro J. S. Cardoso: LARSYS & ISE, Universidade do Algarve, 8005-139 Faro, Portugal
João M. F. Rodrigues: LARSYS & ISE, Universidade do Algarve, 8005-139 Faro, Portugal

DOI: https://doi.org/10.3390/app112412078
Journal volume & issue: Vol. 11, no. 24
p. 12078

Abstract

Learning to recognize a new object after having learned to recognize other objects may be a simple task for a human, but it is not for machines. The current go-to approaches for teaching a machine to recognize a set of objects are based on deep neural networks (DNNs), so, intuitively, a DNN should also be the solution for teaching a machine new objects on the fly. The problem is that the trained DNN weights used to classify the initial set of objects are extremely fragile: any change to those weights can severely damage the network's capacity to perform the initial recognitions, a phenomenon known as catastrophic forgetting (CF). This paper presents a new DNN-based continual learning (CL) architecture that can deal with CF, the modular dynamic neural network (MDNN). The architecture consists of two main components: (a) a ResNet50-based feature extraction component as the backbone; and (b) a modular dynamic classification component, which consists of multiple sub-networks and progressively builds itself up in a tree-like structure that rearranges itself as it learns over time, in such a way that each sub-network can function independently. The main contribution of the paper is this new architecture, which relies heavily on its modular dynamic training feature. The modular structure allows new classes to be added while altering only specific sub-networks, so that previously known classes are not forgotten. Tests on the CORe50 dataset showed results above the state of the art for CL architectures.
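
The modular idea in the abstract can be illustrated with a deliberately simplified sketch. This is not the authors' implementation: the frozen ResNet50 backbone is replaced by a hypothetical identity function `extract_features`, each sub-network is reduced to a set of per-class mean feature vectors, and the modules are kept in a flat list rather than the self-rearranging tree the paper describes. All names (`SubNetwork`, `ModularClassifier`, `capacity`) are assumptions made for illustration. The only point shown is that adding a class touches one module and leaves the others unchanged.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Sequence, Tuple


def extract_features(x: Sequence[float]) -> List[float]:
    """Hypothetical stand-in for a frozen backbone; here it is the identity."""
    return [float(v) for v in x]


@dataclass
class SubNetwork:
    """Toy independent module: stores one mean feature vector per class."""
    prototypes: Dict[str, List[float]] = field(default_factory=dict)

    def learn(self, label: str, samples: Sequence[Sequence[float]]) -> None:
        feats = [extract_features(s) for s in samples]
        dim = len(feats[0])
        # Per-dimension mean of the class samples.
        self.prototypes[label] = [
            sum(f[i] for f in feats) / len(feats) for i in range(dim)
        ]

    def best_match(self, feat: Sequence[float]) -> Tuple[str, float]:
        """Return the closest class in this module and its squared distance."""
        def sq_dist(proto: Sequence[float]) -> float:
            return sum((a - b) ** 2 for a, b in zip(feat, proto))
        label = min(self.prototypes, key=lambda k: sq_dist(self.prototypes[k]))
        return label, sq_dist(self.prototypes[label])


@dataclass
class ModularClassifier:
    """Flat collection of modules (assumed simplification of the tree)."""
    capacity: int = 2  # assumed maximum number of classes per module
    modules: List[SubNetwork] = field(default_factory=list)

    def add_class(self, label: str, samples: Sequence[Sequence[float]]) -> None:
        # Open a new module when none exists or the last one is full;
        # earlier modules are never modified.
        if not self.modules or len(self.modules[-1].prototypes) >= self.capacity:
            self.modules.append(SubNetwork())
        self.modules[-1].learn(label, samples)

    def predict(self, x: Sequence[float]) -> str:
        feat = extract_features(x)
        candidates = [m.best_match(feat) for m in self.modules if m.prototypes]
        return min(candidates, key=lambda c: c[1])[0]


if __name__ == "__main__":
    clf = ModularClassifier(capacity=2)
    clf.add_class("cup", [[0.0, 0.0], [0.2, 0.0]])
    clf.add_class("ball", [[5.0, 5.0], [5.0, 5.2]])
    clf.add_class("pen", [[10.0, 0.0]])  # lands in a second module
    print(len(clf.modules), clf.predict([0.1, 0.1]), clf.predict([9.0, 0.5]))
```

In a real system the prototypes would be replaced by trainable sub-networks operating on backbone features; the sketch only mirrors the isolation property, not the reported accuracy or training procedure.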

Published in Applied Sciences

ISSN: 2076-3417 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Technology: Engineering (General). Civil engineering (General); Science: Biology (General); Science: Physics; Science: Chemistry
Website: http://www.mdpi.com/journal/applsci
