Hydrology and Earth System Sciences (Jan 2017)
Looking beyond general metrics for model comparison – lessons from an international model intercomparison study
Abstract
International collaboration between research institutes and universities is a promising way to reach consensus on hydrological model development. Although model comparison studies are very valuable for international cooperation, they often do not lead to clear new insights regarding the relevance of the modelled processes. We hypothesise that this is partly caused by model complexity and by the comparison methods used, which focus too much on good overall performance rather than on a variety of specific events. In this study, we use an approach that focuses on the evaluation of specific events and characteristics. Eight international research groups calibrated their hourly model on the Ourthe catchment in Belgium and carried out a validation in time for the Ourthe catchment and a validation in space for nested and neighbouring catchments. The same protocol was followed for each model, and an ensemble of best-performing parameter sets was selected. Although the models showed similar performances based on general metrics (i.e. the Nash–Sutcliffe efficiency), clear differences could be observed for specific events. We analysed the hydrographs of these specific events and conducted three types of statistical analyses on the entire time series: cumulative discharges, empirical extreme value distributions of the peak flows, and flow duration curves for low flows. For the studied catchments, the results illustrate the relevance of including a very quick flow reservoir preceding the root zone storage to model peaks during low flows, and of including a slow reservoir in parallel with the fast reservoir to model the recession. This intercomparison enhanced the understanding of the hydrological functioning of the catchment, in particular for low flows, and made it possible to identify present knowledge gaps for other parts of the hydrograph. Above all, it helped to evaluate each model against a set of alternative models.
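
To illustrate the kinds of metrics named above, the following is a minimal sketch, not the study's own code. It shows the standard Nash–Sutcliffe efficiency alongside two of the whole-series analyses mentioned (cumulative discharge and a flow duration curve). The function names are hypothetical, and the Weibull plotting position used for exceedance probabilities is an assumption; the abstract does not state which formula was used.

```python
import numpy as np


def nash_sutcliffe(obs, sim):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the observed mean."""
    obs = np.asarray(obs, dtype=float)
    sim = np.asarray(sim, dtype=float)
    return 1.0 - np.sum((obs - sim) ** 2) / np.sum((obs - obs.mean()) ** 2)


def cumulative_discharge(q):
    """Running sum of discharge over the time series."""
    return np.cumsum(np.asarray(q, dtype=float))


def flow_duration_curve(q):
    """Return (exceedance probability, discharge sorted from high to low).

    Assumption: Weibull plotting position i / (n + 1) for the i-th largest flow.
    """
    q_sorted = np.sort(np.asarray(q, dtype=float))[::-1]
    n = q_sorted.size
    exceedance = np.arange(1, n + 1) / (n + 1)
    return exceedance, q_sorted
```

A single score such as `nash_sutcliffe(obs, sim)` summarises the whole series in one number, whereas the low-flow end of `flow_duration_curve` (high exceedance probabilities) can be compared between observed and simulated series to expose differences that the overall score hides.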