Advances in Engineering and Intelligence Systems (Jun 2025)

Hybrid Machine Learning in Hydrological Runoff Forecasting: An Exploration of Extreme Gradient-Boosting and Categorical Gradient Boosting Optimization in the Russian River Basin

  • Reza Seifi Majdar,
  • Ali Rahnamaei,
  • Vahid Babazadeh

DOI
https://doi.org/10.22034/aeis.2025.509199.1293
Journal volume & issue
Vol. 004, no. 02
pp. 104 – 120

Abstract

Accurate and reliable runoff forecasts are essential for effective water resource management and flood control operations. Hydrological forecasting plays a key role in decision-making, especially under changing climate conditions, and recent advances in machine learning (ML) have opened new opportunities to improve prediction accuracy. This study evaluates commonly used ML methods for runoff prediction, with an emphasis on simplicity and comparability to more advanced models. In particular, the boosting algorithms Extreme Gradient Boosting (XGBoost) and Categorical Boosting (CatBoost) are examined because of their strong performance in previous hydrological studies. These models can capture complex, non-linear relationships and offer high accuracy with efficient computation; their ability to handle missing and categorical data adds to their practical advantages. Three metaheuristic optimization techniques, the Grey Wolf Optimizer (GWO), the Slime Mould Algorithm (SMA), and Particle Swarm Optimization (PSO), were incorporated to improve forecasting accuracy. Data were collected from reliable sources and preprocessed, with 80% used for training and 20% for testing. Both the individual algorithms and their hybrid counterparts were evaluated. The results show that XGBoost and CatBoost provide accurate and robust runoff predictions, with XGBoost performing best, notably in its hybrid form with SMA, which achieved an R² value of 0.98227. The study indicates the promise of hybrid models for improving hydrological forecasting and supporting water resource planning and flood risk management, yet also emphasizes the need for further refinement in capturing peak values.
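
For readers unfamiliar with the evaluation setup named in the abstract, the following is a minimal sketch of an 80/20 train/test split and of the coefficient of determination used to score predictions. It is not the authors' code: the function names and the sample values are hypothetical, and the split is assumed here to be a simple ordered cut, which the abstract does not specify.

```python
def split_80_20(samples):
    """Return (train, test) using the first 80% for training.

    Assumption: an ordered (unshuffled) cut; the abstract does not
    say how the 80/20 partition was drawn.
    """
    cut = len(samples) * 8 // 10  # integer arithmetic avoids float rounding
    return samples[:cut], samples[cut:]


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical runoff values, for illustration only.
runoff = [12.0, 15.5, 9.8, 30.2, 22.1, 18.4, 11.0, 14.3, 27.9, 16.6]
train, test = split_80_20(runoff)
```

A model that reproduces the observations exactly scores 1.0, while one that always predicts the observed mean scores 0.0, so a value such as the reported 0.98227 indicates that most of the variance in the test data is explained.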

Keywords