IEEE Access (Jan 2022)
Improving the Forecasting and Classification of Extreme Events in Imbalanced Time Series Through Block Resampling in the Joint Predictor-Forecast Space
Abstract
A novel resampling strategy is introduced to improve the forecasting and classification accuracies of events in imbalanced time series (ITS) containing a mix of low-probability extreme observations and high-probability normal observations. The lag-based strategy mitigates the imbalance problem by modelling an ITS as a composition of normal and extreme observations, combining the input predictor variables and the associated forecast output into moving blocks, categorizing the blocks as extreme event (EE) or normal event (NE) blocks, and selectively resampling the blocks. Combining the predictor variables with the associated forecast allows the input and output to be resampled simultaneously in the joint predictor-forecast (PF) space. Imbalance is decreased by oversampling the minority EE blocks and undersampling the majority NE blocks. The EE blocks are oversampled using a modification of block bootstrapping and a modification of the synthetic minority oversampling technique. The Box-Cox transform is employed to decrease the pattern complexity caused by the mixing of disparate extreme and normal observations in the ITS. Convolutional neural network and long short-term memory deep neural networks (DNNs) are selected for forecast modelling and tested on a set of simulated and real sub-basin outflow ITS. The root mean square errors, forecast plots, and classification accuracies show that the hybrid forecasting and classification DNN models trained on the block-balanced training sets extracted from the Box-Cox transformed ITS dramatically outperform the corresponding baseline models, which are trained directly on the ITS.
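The block construction and resampling steps described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (make_pf_blocks, label_blocks, balance_blocks) are hypothetical, the rule that a block is EE when a forecast value exceeds a fixed threshold is an assumption, and plain bootstrapping with replacement stands in for the paper's modified block bootstrap and modified synthetic minority oversampling. The Box-Cox transform is omitted.

```python
import numpy as np


def make_pf_blocks(series, n_lags, horizon=1):
    """Slide a window over the series; each row is one joint PF block:
    n_lags predictor values followed by `horizon` forecast values."""
    series = np.asarray(series, dtype=float)
    width = n_lags + horizon
    n_blocks = len(series) - width + 1
    return np.stack([series[i:i + width] for i in range(n_blocks)])


def label_blocks(blocks, horizon, threshold):
    """Assumed rule (not from the paper): a block is an EE block
    if any value in its forecast part exceeds the threshold."""
    return (blocks[:, -horizon:] > threshold).any(axis=1)


def balance_blocks(blocks, is_extreme, n_per_class, rng):
    """Oversample EE blocks with replacement (plain bootstrap) and
    undersample NE blocks without replacement, then shuffle."""
    ee_idx = np.flatnonzero(is_extreme)
    ne_idx = np.flatnonzero(~is_extreme)
    ee_pick = rng.choice(ee_idx, size=n_per_class, replace=True)
    ne_pick = rng.choice(ne_idx, size=min(n_per_class, len(ne_idx)), replace=False)
    picked = rng.permutation(np.concatenate([ee_pick, ne_pick]))
    return blocks[picked], is_extreme[picked]


# Toy series: a smooth signal with five injected spikes standing in for extremes.
n_lags, horizon = 4, 1
series = np.sin(0.1 * np.arange(500))
series[[50, 150, 250, 350, 450]] = 10.0

blocks = make_pf_blocks(series, n_lags, horizon)
is_extreme = label_blocks(blocks, horizon, threshold=5.0)

rng = np.random.default_rng(0)
balanced, balanced_labels = balance_blocks(blocks, is_extreme, n_per_class=100, rng=rng)

# Split each resampled block back into model input (X) and forecast target (y).
X, y = balanced[:, :n_lags], balanced[:, n_lags:]
```

Because each block carries its predictors and its forecast target together, every resampled row stays a consistent input-output pair, which is the point of resampling in the joint PF space rather than resampling inputs and outputs separately.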
Keywords