Applied Water Science (Jun 2022)
Can sampling techniques improve the performance of decomposition-based hydrological prediction models? Exploration of some comparative experiments
Abstract
The development of sequence decomposition techniques in recent years has led to the wide use of decomposition-based prediction models in hydrological forecasting. These models usually extract samples with the overall decomposition (OD) sampling technique. Some studies have shown that OD sampling produces abnormally “high” model performance because it uses future information, and that it therefore cannot be applied in practice. Several researchers have proposed alternative sampling techniques, such as semi-stepwise decomposition (SSD), fully stepwise decomposition (FSD), and single-model SSD (SMSSD). This study additionally proposes an improved single-model FSD (SMFSD) sampling technique. Four decomposition methods are considered: discrete wavelet transform (DWT), empirical mode decomposition (EMD), complete ensemble empirical mode decomposition with adaptive noise (CEEMDAN), and variational mode decomposition (VMD). Models developed with the OD sampling technique are investigated systematically, and the applicability of the SSD, FSD, SMSSD, and SMFSD sampling techniques is evaluated. Monthly runoff prediction with the five sampling techniques and four decomposition methods at five representative hydrological stations in Poyang Lake, China, shows that (1) EMD and CEEMDAN (including improved EMD-based adaptive decomposition methods) cannot be used to construct stepwise decomposition prediction models, because the stepwise decomposition strategy yields a variable number of sub-series. (2) The OD sampling technique cannot produce convincing models for practical prediction, because future information is introduced into the samples used for model training. (3) Models based on the SSD and SMSSD sampling techniques do not use future information in training, but they suffer from severe overfitting and inferior prediction performance. (4) Models based on the FSD and SMFSD sampling techniques produce convincing prediction results, and combining the proposed SMFSD sampling technique with VMD yields prediction models with superior performance and significantly higher efficiency.
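The difference between OD sampling and a stepwise strategy can be illustrated with a minimal Python sketch. It is not the procedure used in the study: the centered moving-average `decompose` is a toy stand-in for DWT or VMD, the runoff values are made up, and the function names, `lag`, and `min_len` are hypothetical. The stepwise sampler simply re-decomposes the data available before each target time, which is one plausible reading of a fully stepwise scheme.

```python
def decompose(series, half_window=2):
    """Toy decomposition (assumption, not DWT/VMD): centered moving-average
    trend plus residual. The centered window makes each trend value depend
    on points that come after it, mimicking the boundary effect of real
    decomposition methods."""
    n = len(series)
    trend = []
    for i in range(n):
        lo = max(0, i - half_window)
        hi = min(n, i + half_window + 1)
        window = series[lo:hi]
        trend.append(sum(window) / len(window))
    residual = [x - t for x, t in zip(series, trend)]
    return trend, residual


def od_samples(series, lag=3, min_len=6):
    """Overall decomposition: decompose the whole series once, then slice
    lagged sub-series values as inputs for each target time t."""
    trend, residual = decompose(series)
    samples = {}
    for t in range(max(lag, min_len), len(series)):
        samples[t] = (trend[t - lag:t] + residual[t - lag:t], series[t])
    return samples


def stepwise_samples(series, lag=3, min_len=6):
    """Stepwise decomposition: for each target time t, decompose only the
    observations before t and take the last `lag` values of each sub-series."""
    samples = {}
    for t in range(max(lag, min_len), len(series)):
        trend, residual = decompose(series[:t])
        samples[t] = (trend[-lag:] + residual[-lag:], series[t])
    return samples


def inputs_depend_on_future(sampler, series, t, shift=100.0):
    """Shift every value after time t and report whether the inputs of the
    sample that targets time t change as a result."""
    altered = series[:t + 1] + [v + shift for v in series[t + 1:]]
    return sampler(series)[t][0] != sampler(altered)[t][0]


# Made-up monthly runoff values, for illustration only.
runoff = [5.0, 7.0, 6.0, 9.0, 12.0, 10.0, 8.0, 11.0, 14.0, 13.0, 9.0, 10.0]
print("OD inputs depend on future values:",
      inputs_depend_on_future(od_samples, runoff, t=8))
print("Stepwise inputs depend on future values:",
      inputs_depend_on_future(stepwise_samples, runoff, t=8))
```

The stepwise sampler calls `decompose` once per sample rather than once per series, which is the price of keeping later observations out of the inputs. With an adaptive method such as EMD, each of those calls could also return a different number of sub-series, so the input vectors would no longer have a fixed length.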
Keywords