Algorithms (Oct 2018)

Bidirectional Grid Long Short-Term Memory (BiGridLSTM): A Method to Address Context-Sensitivity and Vanishing Gradient

  • Hongxiao Fei,
  • Fengyun Tan

DOI
https://doi.org/10.3390/a11110172
Journal volume & issue
Vol. 11, no. 11
p. 172

Abstract

Read online

The Recurrent Neural Network (RNN) utilizes dynamically changing time information through time cycles, so it is very suitable for tasks with time sequence characteristics. However, with the increase of the number of layers, the vanishing gradient occurs in the RNN. The Grid Long Short-Term Memory (GridLSTM) recurrent neural network can alleviate this problem in two dimensions by taking advantage of the two dimensions calculated in time and depth. In addition, the time sequence task is related to the information of the current moment before and after. In this paper, we propose a method that takes into account context-sensitivity and gradient problems, namely the Bidirectional Grid Long Short-Term Memory (BiGridLSTM) recurrent neural network. This model not only takes advantage of the grid architecture, but it also captures information around the current moment. A large number of experiments on the dataset LibriSpeech show that BiGridLSTM is superior to other deep LSTM models and unidirectional LSTM models, and, when compared with GridLSTM, it gets about 26 percent gain improvement.

Keywords