Applied Water Science (Nov 2022)

Investigation of cross-entropy-based streamflow forecasting through an efficient interpretable automated search process

  • K. L. Chong,
  • Y. F. Huang,
  • C. H. Koo,
  • Mohsen Sherif,
  • Ali Najah Ahmed,
  • Ahmed El-Shafie

DOI
https://doi.org/10.1007/s13201-022-01790-5
Journal volume & issue
Vol. 13, no. 1
pp. 1 – 32

Abstract

Read online

Abstract Streamflow forecasting has always been important in water resources management, particularly the peak flow, which often determines the seriousness of the impending flood. However, the highly imbalanced flow distribution often hinders the machine learning algorithm's performance. In this paper, streamflow forecasting was approached through the formulation of two distinct machine learning problems: categorical streamflow forecast and regression streamflow forecast. Due to the distinctive characteristics of these two adopted forms, selecting the correct algorithm for the machine learning problem along with their hyperparameter tuning process is critical to the realization of the desired results. For the distinct streamflow formulated scenarios, three neural network algorithms and their hyperparameter tuning strategy were investigated. The comparative empirical studies had revealed that formulated categorical-based streamflow forecast is a better choice than a regression-based streamflow forecast, regardless of the algorithms used; for instance, the f1-score of 0.7 (categorical based) is obtained compared to the 0.53 (regression based) for the LSTM in scenario 1 (binary). Furthermore, forest-based algorithms were investigated and shown to be superior at forecasting high streamflow fluctuations in situations featuring low-dimensional streamflow input. Besides, encoding the streamflow time series as images (input) for forecasting purposes would require a thorough analysis as there is a discrepancy in the results, revealing that not all approaches are suitable for streamflow image transformation. The functional ANOVA analysis provided evidence to substantiate the Bayesian optimization results, implying that the hyperparameters were effectively optimized.

Keywords