Journal of Hydroinformatics (May 2024)
A novel machine learning-based framework for the water quality parameters prediction using hybrid long short-term memory and locally weighted scatterplot smoothing methods
Abstract
Water quality prediction is crucial for effective river stream management. Dissolved oxygen, conductivity and chemical oxygen demand are vital chemical parameters of water quality. Advances in machine learning (ML) and deep learning (DL) have made these methods widely used in this domain. Accurate, real-time multistep prediction requires sophisticated DL techniques, especially long short-term memory (LSTM) networks, which are effective at predicting water quality because they can handle long-term dependencies in sequential data. We propose a novel hybrid approach to water quality parameter prediction that combines DL with a data smoothing method. The Sava River at the Jamena hydrological station serves as a case study. Our workflow uses LSTM networks alongside the LOcally WEighted Scatterplot Smoothing (LOWESS) technique for data filtering. For comparison, a Support Vector Regressor (SVR) is used as the baseline method. Performance is evaluated using the Root Mean Squared Error (RMSE) and the Coefficient of Determination (R²). Results demonstrate that LSTM outperforms the baseline method, with an R² of up to 0.9998 and an RMSE of 0.0230 on the test set for dissolved oxygen. Over a 5-day prediction period, our approach achieves an R² of 0.9912 and an RMSE of 0.1610, confirming it as a reliable method for multistep prediction of water quality parameters.
HIGHLIGHTS
Water quality prediction using a hybrid machine learning-based approach.
Time series data forecasting using long short-term memory networks.
Data denoising using the LOcally WEighted Scatterplot Smoothing method.
Case study focused on the Sava River at the Jamena station.
The new hybrid approach outperformed the baseline prediction method.
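The smoothing step and the two evaluation metrics named in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the function names (`lowess`, `rmse`, `r2`), the `frac` parameter and the sample values are assumptions made for illustration. The LOWESS shown here is a simplified single-pass variant (tricube weights, local linear fit, no robustness iterations), and the LSTM and SVR models are omitted.

```python
import math

def tricube(u):
    """Tricube kernel: (1 - |u|^3)^3 for |u| < 1, otherwise 0."""
    u = abs(u)
    return (1.0 - u ** 3) ** 3 if u < 1.0 else 0.0

def lowess(x, y, frac=0.5):
    """Simplified LOWESS: one weighted local linear fit per point.

    `frac` (assumed parameter) is the share of points used in each fit.
    """
    n = len(x)
    k = min(n, max(2, math.ceil(frac * n)))  # neighbours per local fit
    smoothed = []
    for xi in x:
        # Bandwidth: distance from xi to its k-th nearest point
        dists = sorted(abs(xj - xi) for xj in x)
        h = dists[k - 1] or 1e-12
        w = [tricube((xj - xi) / h) for xj in x]
        # Weighted least squares for the local line y = a + b * x
        sw = sum(w)
        mx = sum(wj * xj for wj, xj in zip(w, x)) / sw
        my = sum(wj * yj for wj, yj in zip(w, y)) / sw
        sxx = sum(wj * (xj - mx) ** 2 for wj, xj in zip(w, x))
        sxy = sum(wj * (xj - mx) * (yj - my)
                  for wj, xj, yj in zip(w, x, y))
        b = sxy / sxx if sxx > 0 else 0.0
        smoothed.append(my + b * (xi - mx))
    return smoothed

def rmse(y_true, y_pred):
    """Root Mean Squared Error."""
    sq = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(sq / len(y_true))

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Made-up daily readings, for illustration only (not data from the study)
days = [float(d) for d in range(10)]
readings = [8.1, 8.4, 7.9, 8.6, 8.2, 8.8, 8.3, 8.9, 8.5, 9.0]
filtered = lowess(days, readings, frac=0.5)
```

In a workflow like the one described, the filtered series would be passed to the forecasting model, and `rmse` and `r2` would then be computed between the model's predictions and the held-out observations.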
Keywords