IEEE Access (Jan 2024)

Prox-STA-LSTM: A Sparse Representation for the Attention-Based LSTM Networks for Industrial Soft Sensor Development

  • Yurun Wang,
  • Yi Huang,
  • Dongsheng Chen,
  • Longyan Wang,
  • Lingjian Ye,
  • Feifan Shen

DOI
https://doi.org/10.1109/ACCESS.2024.3409899
Journal volume & issue
Vol. 12
pp. 80633 – 80645

Abstract

For deep-learning-based soft sensors, the spatiotemporal attention (STA)-LSTM is a recently emerged technique that provides efficient predictions for quality variables of industrial processes. However, the STA-LSTM method calls for an enormous network structure, which contains redundant network weights and therefore diminishes the model's generalization ability. In this paper, we consider a sparse model representation for the STA-LSTM to cope with this problem. The $\ell_{1}$-regularization, a popular means of promoting sparsity, is introduced into the loss function of the STA-LSTM. The $\ell_{1}$-regularized formulation is a non-smooth optimization problem, which cannot be solved well by common gradient descent approaches. We deploy the proximal operator, a well-principled mathematical tool for handling non-smooth optimization problems, to solve the $\ell_{1}$-regularized STA-LSTM formulation. The new algorithm is developed within the framework of the state-of-the-art Adam algorithm, and the sparse representation for the STA-LSTM is referred to as Prox-STA-LSTM. Finally, two industrial cases, a carbon absorber and a desulfurization process, are investigated using the new soft sensor. The results show that Prox-STA-LSTM can successfully sparsify the STA-LSTM networks. More importantly, the prediction performance is also enhanced.
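As a rough illustration of the idea described above, the proximal operator of the $\ell_{1}$ penalty is the soft-thresholding function, which can be applied after an ordinary Adam update of the smooth part of the loss. The sketch below is a generic proximal-Adam step on a flat list of weights, not the paper's exact algorithm: the function names, the hyperparameter defaults, and the choice of `lr * lam` as the threshold are all assumptions made for illustration.

```python
import math


def soft_threshold(w, tau):
    """Proximal operator of tau * |w|: shrink w toward zero by tau."""
    return math.copysign(max(abs(w) - tau, 0.0), w)


def prox_adam_step(weights, grads, m, v, t, lr=0.01, lam=0.1,
                   beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam step on the smooth loss, then an l1 proximal step.

    Hypothetical helper for this sketch. `grads` is the gradient of the
    smooth (unregularized) loss, `m` and `v` are Adam's moment estimates,
    and `t` is the 1-based step counter.
    """
    new_w, new_m, new_v = [], [], []
    for w, g, m_i, v_i in zip(weights, grads, m, v):
        # Standard Adam moment updates with bias correction.
        m_i = beta1 * m_i + (1 - beta1) * g
        v_i = beta2 * v_i + (1 - beta2) * g * g
        m_hat = m_i / (1 - beta1 ** t)
        v_hat = v_i / (1 - beta2 ** t)
        w = w - lr * m_hat / (math.sqrt(v_hat) + eps)
        # Proximal step: weights with magnitude below lr * lam become
        # exactly zero (assumed threshold scaling).
        new_w.append(soft_threshold(w, lr * lam))
        new_m.append(m_i)
        new_v.append(v_i)
    return new_w, new_m, new_v
```

Unlike adding a subgradient of the $\ell_{1}$ term to the gradient, the thresholding step sets small weights exactly to zero, which is what makes the resulting network sparse.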

Keywords