Ecological Informatics (Dec 2025)

River total dissolved gas prediction using a hybrid greedy-stepwise feature selection and bidirectional long short-term memory model

  • Khabat Khosravi,
  • Salim Heddam,
  • Changhyun Jun,
  • Sayed M. Bateni,
  • Dongkyun Kim,
  • Essam Heggy

DOI
https://doi.org/10.1016/j.ecoinf.2025.103191
Journal volume & issue
Vol. 90
p. 103191

Abstract

The supersaturation of total dissolved gas (TDG) in rivers is a critical indicator of water quality downstream of high dams. This study models TDG levels at two monitoring stations in the Columbia and Snake River Basins (USA), where high TDG concentrations were recorded. Hourly data on water temperature, barometric pressure, dam spill, sensor depth, and discharge serve as input variables for the predictive models. Several models are developed and tested, including long short-term memory (LSTM), bidirectional LSTM (BiLSTM), gated recurrent unit (GRU), and an alternating model tree (AMT) hybridized with iterative absolute error regression (IAER) and iterative classifier optimizer (ICO) algorithms. A greedy stepwise feature selection technique (GSFST) is employed to identify the optimal model inputs. Each model is trained and tested at one station and then validated at the second station to assess transferability and generalization capability. Model performance is compared using multiple quantitative and qualitative metrics, including the Nash–Sutcliffe efficiency and the uncertainty coefficient. Friedman and Wilcoxon signed-rank tests confirmed statistically significant differences between models. Dam spill emerged as the most influential predictor of TDG levels at both sites. The GSFST selected dam spill, water temperature, barometric pressure, and sensor depth as the optimal input combination. Among all models, GSFST-BiLSTM achieved the highest predictive accuracy, with Nash–Sutcliffe values of 0.95 (testing) and 0.90 (validation) and uncertainty coefficients of 5.2% and 7.0%, respectively. These findings indicate that GSFST-BiLSTM provides a robust and transferable framework for TDG prediction, with the potential for broader application pending further validation.
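
As a rough illustration of two ideas named in the abstract, the sketch below computes the Nash–Sutcliffe efficiency and runs a simple greedy forward stepwise selection scored by that metric. It is not the authors' implementation: the abstract does not say which evaluator GSFST uses, so an ordinary least-squares fit stands in for it, and the function names, the `min_gain` stopping threshold, and the synthetic data (including its coefficients) are all assumptions made for this example.

```python
import numpy as np


def nash_sutcliffe(observed, predicted):
    """NSE = 1 - sum((obs - pred)^2) / sum((obs - mean(obs))^2)."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    residual = np.sum((observed - predicted) ** 2)
    variance = np.sum((observed - observed.mean()) ** 2)
    return 1.0 - residual / variance


def _linear_fit_nse(X_train, y_train, X_eval, y_eval):
    # Stand-in evaluator (assumption): least squares with an intercept,
    # scored by NSE on the held-out evaluation split.
    A_train = np.column_stack([np.ones(len(X_train)), X_train])
    A_eval = np.column_stack([np.ones(len(X_eval)), X_eval])
    coef, *_ = np.linalg.lstsq(A_train, y_train, rcond=None)
    return nash_sutcliffe(y_eval, A_eval @ coef)


def greedy_forward_selection(X_train, y_train, X_eval, y_eval, names,
                             min_gain=1e-3):
    """Add one feature at a time, keeping the one that raises NSE most.

    Stops when the best remaining candidate improves NSE by less than
    `min_gain` (a hypothetical threshold chosen for this sketch).
    """
    selected, best_score = [], -np.inf
    remaining = list(range(X_train.shape[1]))
    while remaining:
        scores = {
            j: _linear_fit_nse(X_train[:, selected + [j]], y_train,
                               X_eval[:, selected + [j]], y_eval)
            for j in remaining
        }
        best_j = max(scores, key=scores.get)
        if scores[best_j] - best_score < min_gain:
            break
        selected.append(best_j)
        remaining.remove(best_j)
        best_score = scores[best_j]
    return [names[j] for j in selected], best_score


def demo():
    # Synthetic data only: the target depends on the first two columns
    # by construction, which says nothing about the real stations.
    rng = np.random.default_rng(0)
    names = ["dam_spill", "water_temperature", "barometric_pressure",
             "sensor_depth", "discharge"]
    X = rng.normal(size=(400, len(names)))
    y = 3.0 * X[:, 0] + 1.0 * X[:, 1] + 0.1 * rng.normal(size=400)
    return greedy_forward_selection(X[:300], y[:300], X[300:], y[300:], names)


if __name__ == "__main__":
    chosen, score = demo()
    print(chosen, round(score, 3))
```

An NSE of 1 means a perfect fit, and 0 means the model does no better than predicting the observed mean. Scoring each candidate on a held-out split rather than on the training data keeps the greedy search from rewarding features that only fit noise.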

Keywords