IEEE Access (Jan 2020)

NEWLSTM: An Optimized Long Short-Term Memory Language Model for Sequence Prediction

  • Qing Wang,
  • Rong-Qun Peng,
  • Jia-Qiang Wang,
  • Zhi Li,
  • Han-Bing Qu

DOI
https://doi.org/10.1109/ACCESS.2020.2985418
Journal volume & issue
Vol. 8
pp. 65395–65401

Abstract


The long short-term memory (LSTM) model, trained on the general language modeling task, overcomes the vanishing-gradient bottleneck of the traditional recurrent neural network (RNN) and performs well across many natural language processing tasks. However, although the LSTM alleviates the vanishing gradient problem of the RNN, information is still lost over long-distance transmission, which limits its practical use. In this paper, we propose NEWLSTM, an improved LSTM model that mitigates both the excessive parameter count of the standard LSTM and the vanishing gradient. NEWLSTM correlates the cell state directly with the current input: the input gate and forget gate of the traditional LSTM are merged into a single gate and some components are removed, which reduces the number of parameters, simplifies the computation, and shortens iteration time. The resulting neural network model learns the relationships among input sequences in order to predict the language sequence. Experimental results on multiple test sets show that the new model is simpler than the traditional LSTM and its variants, has better overall stability, and handles sparse words better.
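To make the gate-merging idea concrete, below is a minimal sketch of a recurrent cell with a coupled input/forget gate (f_t = 1 − i_t), which is one common way to realize the integration the abstract describes and removes one gate's worth of parameters relative to a standard LSTM. This is an illustration under that assumption, not the paper's exact formulation; the precise NEWLSTM equations are given in the full text, and the function and parameter names here (coupled_lstm_step, W_i, U_i, etc.) are hypothetical.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def coupled_lstm_step(x_t, h_prev, c_prev, params):
    """One step of a coupled-gate LSTM cell (illustrative sketch only).

    The input gate i_t also acts as the complement of the forget gate,
    i.e. f_t = 1 - i_t, so the separate forget-gate parameters are dropped.
    """
    W_i, U_i, b_i = params["W_i"], params["U_i"], params["b_i"]
    W_c, U_c, b_c = params["W_c"], params["U_c"], params["b_c"]
    W_o, U_o, b_o = params["W_o"], params["U_o"], params["b_o"]

    i_t = sigmoid(W_i @ x_t + U_i @ h_prev + b_i)       # merged input/forget gate
    c_tilde = np.tanh(W_c @ x_t + U_c @ h_prev + b_c)   # candidate state from current input
    c_t = (1.0 - i_t) * c_prev + i_t * c_tilde          # coupled update: forget = 1 - input
    o_t = sigmoid(W_o @ x_t + U_o @ h_prev + b_o)       # output gate
    h_t = o_t * np.tanh(c_t)                            # new hidden state
    return h_t, c_t


# Tiny usage example with random parameters and a short input sequence.
rng = np.random.default_rng(0)
n_in, n_hid = 4, 8
shapes = {
    "W_i": (n_hid, n_in), "U_i": (n_hid, n_hid), "b_i": (n_hid,),
    "W_c": (n_hid, n_in), "U_c": (n_hid, n_hid), "b_c": (n_hid,),
    "W_o": (n_hid, n_in), "U_o": (n_hid, n_hid), "b_o": (n_hid,),
}
params = {name: rng.standard_normal(shape) * 0.1 for name, shape in shapes.items()}

h = np.zeros(n_hid)
c = np.zeros(n_hid)
for x in rng.standard_normal((5, n_in)):
    h, c = coupled_lstm_step(x, h, c, params)
```

Compared with a standard LSTM step, this cell computes three gate/candidate projections instead of four, which is consistent with the abstract's claim of fewer parameters and less computation per iteration.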

Keywords