IEEE Access (Jan 2019)
Recurrent Neural Networks With Finite Memory Length
Abstract
The workings of recurrent neural networks are not yet well understood, so the construction of such models relies largely on heuristics and intuition. This paper formalizes the notion of “memory length” for recurrent networks and, on that basis, identifies a generic family of recurrent networks with maximal memory length. Stacking such networks into multiple layers is shown to yield powerful models, including gated convolutional networks. We show that the structure of these networks potentially enables a more principled design approach in practice and entails no vanishing or exploding gradients during back-propagation. We also present a new member of this family, termed the attentive activation recurrent unit (AARU). Experimentally, we demonstrate that this network family, particularly AARU, outperforms LSTM and GRU networks.
Keywords