Uncertainty Quantification in Machine Learning Modeling for Multi-Step Time Series Forecasting: Example of Recurrent Neural Networks in Discharge Simulations

Tianyu Song; Wei Ding; Haixing Liu; Jian Wu; Huicheng Zhou; Jinggang Chu

doi:10.3390/w12030912

Water (Mar 2020)

Affiliations

Tianyu Song: School of Hydraulic Engineering, Dalian University of Technology, Dalian 116024, China
Wei Ding: School of Hydraulic Engineering, Dalian University of Technology, Dalian 116024, China
Haixing Liu: School of Hydraulic Engineering, Dalian University of Technology, Dalian 116024, China
Jian Wu: School of Hydraulic Engineering, Dalian University of Technology, Dalian 116024, China
Huicheng Zhou: School of Hydraulic Engineering, Dalian University of Technology, Dalian 116024, China
Jinggang Chu: School of Hydraulic Engineering, Dalian University of Technology, Dalian 116024, China

DOI: https://doi.org/10.3390/w12030912
Journal volume & issue: Vol. 12, no. 3, p. 912

Abstract

As a revolutionary tool leading to substantial changes across many areas, Machine Learning (ML) techniques have attracted growing attention in hydrology because of their potential to forecast time series. Deep Learning (DL), a subfield of ML, places more emphasis on datasets, algorithms and layered structures. Despite numerous applications of novel ML/DL techniques in discharge simulation, the uncertainty involved in ML/DL modeling has received little attention, although it is an important issue. In this study, a framework based on analysis of variance (ANOVA) theory is proposed to quantify how much the sample set, the ML approach, the ML architecture and their interactions each contribute to uncertainty in multi-step time-series forecasting. A discharge simulation using Recurrent Neural Networks (RNNs) is then taken as an example. The Long Short-Term Memory (LSTM) network, a state-of-the-art DL approach, was selected for its outstanding performance in time-series forecasting and compared with a simple RNN. In addition, a novel discharge forecasting architecture was designed by combining hydrological expertise with a stacked DL structure, and compared with a conventional design. Taking hourly discharge simulations of the Anhe catchment (China) as a case study, we constructed five sample sets, chose two RNN approaches and designed two ML architectures. The results indicate that none of the investigated uncertainty sources is negligible and that the influence of each source varies with lead time and discharge. LSTM demonstrates its superiority in discharge simulations, and the ML architecture is as important as the ML approach. In addition, some of the uncertainty is attributable to interactions rather than to individual modeling components. The proposed framework can both quantify uncertainty in ML/DL modeling and provide references for ML approach evaluation and architecture design in discharge simulations. The findings indicate that uncertainty quantification is indispensable for the successful application of ML/DL.
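
The ANOVA-based attribution described in the abstract can be illustrated with a minimal sketch. It assumes a balanced design with exactly one forecast per combination of sample set, ML approach and ML architecture (5 x 2 x 2, matching the case study) for a single time step and lead time. The function name `anova_contributions` and the synthetic input are hypothetical and are not taken from the paper, and the authors' actual procedure may differ in detail (for example, in how combinations are sampled).

```python
import numpy as np

def anova_contributions(y):
    """Split the total sum of squares of y into main effects and interactions.

    y: array of shape (n_sample_sets, n_approaches, n_architectures) holding
    one forecast per combination (assumed balanced design, no replicates).
    Returns the fraction of total variance attributed to each term.
    """
    y = np.asarray(y, dtype=float)
    n_a, n_b, n_c = y.shape
    grand = y.mean()

    # Main effects: deviation of each factor-level mean from the grand mean.
    a = y.mean(axis=(1, 2)) - grand
    b = y.mean(axis=(0, 2)) - grand
    c = y.mean(axis=(0, 1)) - grand

    # Two-way interactions: what remains of each pairwise mean after
    # removing the grand mean and the two main effects.
    ab = y.mean(axis=2) - grand - a[:, None] - b[None, :]
    ac = y.mean(axis=1) - grand - a[:, None] - c[None, :]
    bc = y.mean(axis=0) - grand - b[:, None] - c[None, :]

    # Three-way interaction: residual after all lower-order terms.
    abc = (y - grand
           - a[:, None, None] - b[None, :, None] - c[None, None, :]
           - ab[:, :, None] - ac[:, None, :] - bc[None, :, :])

    total = np.sum((y - grand) ** 2)
    if total == 0.0:
        raise ValueError("all forecasts are identical; nothing to decompose")

    sums_of_squares = {
        "sample_set": n_b * n_c * np.sum(a ** 2),
        "approach": n_a * n_c * np.sum(b ** 2),
        "architecture": n_a * n_b * np.sum(c ** 2),
        "sample_set:approach": n_c * np.sum(ab ** 2),
        "sample_set:architecture": n_b * np.sum(ac ** 2),
        "approach:architecture": n_a * np.sum(bc ** 2),
        "sample_set:approach:architecture": np.sum(abc ** 2),
    }
    return {name: float(ss / total) for name, ss in sums_of_squares.items()}

# Synthetic forecasts for illustration only (not data from the study).
rng = np.random.default_rng(0)
forecasts = rng.normal(loc=100.0, scale=10.0, size=(5, 2, 2))
for name, share in anova_contributions(forecasts).items():
    print(f"{name}: {share:.3f}")
```

In a balanced design the seven fractions sum to one, so each can be read as the share of forecast variance attributable to that source. Repeating the calculation per lead time and per discharge level would show how the shares vary, which is the kind of comparison the abstract reports.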

Published in Water

ISSN: 2073-4441 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Technology: Hydraulic engineering; Technology: Environmental technology. Sanitary engineering: Water supply for domestic and industrial purposes
Website: http://www.mdpi.com/journal/water/
