Journal of Hydroinformatics (Nov 2023)
Prediction of multi-sectoral longitudinal water withdrawals using hierarchical machine learning models
Abstract
Accurate models of water withdrawal are crucial in anticipating the potential water use impacts of drought and climate change. Machine learning methods can simulate the complex, nonlinear relationship between water use and potential explanatory factors, but rarely incorporate the hierarchical nature of water use data. This work presents a novel approach for the prediction of water withdrawals across multiple usage sectors using an ensemble of models fit at different hierarchical levels. Models were fit at the facility and sectoral grouping levels, as well as across facility clusters defined by temporal water use characteristics. Using repeated holdout cross-validation and a dataset of over 300,000 observations of monthly water withdrawal across 1,509 facilities, this work demonstrates that ensemble predictions led to statistically significant improvements in predictive performance in five of the eight sectors analyzed. Ensemble modeling produced lower predictive errors than facility-level models in 65% of the facilities analyzed. The relative improvement gained by ensemble modeling was greatest for facilities with fewer observations and higher variance, indicating its potential value in predicting withdrawal for facilities with relatively short data records or data quality issues.
HIGHLIGHTS
Hierarchical ensemble models reduce predictive errors for a majority of facilities analyzed.
Cluster analysis is used to build models for groups of facilities with similar temporal water use behavior.
Ensemble models are most beneficial in facilities with high variance and fewer observations of withdrawal.
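The hierarchical ensemble idea described in the abstract can be illustrated with a minimal sketch. The abstract does not state which learners were used or how the ensemble members were combined, so everything specific here is an assumption made for illustration: a linear regression on annual sine/cosine terms stands in for the machine learning models, the two levels are combined by equal-weight averaging, the cluster level is omitted, and the data, facility names, and function names are synthetic and hypothetical.

```python
import numpy as np

def seasonal_features(month):
    """Annual sine/cosine terms for month numbers 1..12."""
    angle = 2 * np.pi * (np.asarray(month) - 1) / 12
    return np.column_stack([np.sin(angle), np.cos(angle)])

def fit_linear(X, y):
    """Least-squares fit with an intercept; returns the coefficient vector."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_linear(coef, X):
    """Apply coefficients from fit_linear to new feature rows."""
    A = np.column_stack([np.ones(len(X)), X])
    return A @ coef

def ensemble_predict(records, facility, sector_of, month):
    """Average a facility-level and a sector-level prediction.

    Equal weights are an assumption of this sketch, not the paper's method.
    """
    X_new = seasonal_features(month)

    # Facility-level model: fit only on this facility's own record.
    m_f, y_f = records[facility]
    pred_fac = predict_linear(fit_linear(seasonal_features(m_f), y_f), X_new)

    # Sector-level model: pool every facility in the same sector.
    peers = [f for f in records if sector_of[f] == sector_of[facility]]
    m_s = np.concatenate([records[f][0] for f in peers])
    y_s = np.concatenate([records[f][1] for f in peers])
    pred_sec = predict_linear(fit_linear(seasonal_features(m_s), y_s), X_new)

    return 0.5 * (pred_fac + pred_sec), pred_fac, pred_sec

# Synthetic monthly withdrawals (arbitrary units), for illustration only.
rng = np.random.default_rng(0)

def synth(months, base, amplitude):
    seasonal = amplitude * np.sin(2 * np.pi * (months - 1) / 12)
    return base + seasonal + rng.normal(0.0, 0.5, len(months))

months_long = np.tile(np.arange(1, 13), 5)   # five years of monthly data
months_short = np.arange(1, 7)               # six months only

records = {
    "A": (months_long, synth(months_long, 10.0, 4.0)),
    "B": (months_short, synth(months_short, 9.0, 3.0)),   # short record
    "C": (months_long, synth(months_long, 50.0, 1.0)),
}
sector_of = {"A": "irrigation", "B": "irrigation", "C": "industrial"}

# Predict a full year for the facility with the short record.
ens, fac, sec = ensemble_predict(records, "B", sector_of, np.arange(1, 13))
```

In this sketch, facility "B" has only six months of data, so its facility-level fit must extrapolate to months it has never observed, while the pooled sector fit draws on the longer record of facility "A". Pooling raw withdrawals across facilities of different sizes is a simplification; a real application would likely need to normalize or otherwise account for facility scale.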
Keywords