Complexity (Jan 2020)
Training and Testing Data Division Influence on Hybrid Machine Learning Model Process: Application of River Flow Forecasting
Abstract
The hydrological process has a dynamic nature characterised by randomness and complex phenomena. The application of machine learning (ML) models in forecasting river flow has grown rapidly. This is owing to their capacity to simulate the complex phenomena associated with hydrological and environmental processes. Four different ML models were developed for river flow forecasting located in semiarid region, Iraq. The effectiveness of data division influence on the ML models process was investigated. Three data division modeling scenarios were inspected including 70%–30%, 80%–20, and 90%–10%. Several statistical indicators are computed to verify the performance of the models. The results revealed the potential of the hybridized support vector regression model with a genetic algorithm (SVR-GA) over the other ML forecasting models for monthly river flow forecasting using 90%–10% data division. In addition, it was found to improve the accuracy in forecasting high flow events. The unique architecture of developed SVR-GA due to the ability of the GA optimizer to tune the internal parameters of the SVR model provides a robust learning process. This has made it more efficient in forecasting stochastic river flow behaviour compared to the other developed hybrid models.