Symmetry (Nov 2017)

Hierarchical Meta-Learning in Time Series Forecasting for Improved Interference-Less Machine Learning

  • David Afolabi,
  • Sheng-Uei Guan,
  • Ka Lok Man,
  • Prudence W. H. Wong,
  • Xuan Zhao

DOI
https://doi.org/10.3390/sym9110283
Journal volume & issue
Vol. 9, no. 11
p. 283

Abstract

An interference-less machine learning scheme is crucial in time series prediction, as an oversight can have a negative cumulative effect, especially when predicting many steps ahead of the currently available data. Ongoing research on noise elimination in time series forecasting has led to a successful approach: decomposing the data sequence into component trends to identify noise-inducing information. The empirical mode decomposition method separates the time series/signal into a set of intrinsic mode functions ranging from high to low frequencies, which can be summed to reconstruct the original data. Our previous findings show that the usual assumption that random noise is contained only in the high-frequency component does not hold. Those results reveal that noise can be present in a low-frequency component, which motivates the newly proposed algorithm. Additionally, to prevent the erosion of periodic trends and patterns within the series, we learn local and global trends separately in a hierarchical manner, which succeeds in detecting and eliminating short- and long-term noise. The algorithm is tested on four datasets from financial market data and physical science data. The simulation results are compared with conventional and state-of-the-art approaches for time series machine learning, namely the non-linear autoregressive neural network and the long short-term memory recurrent neural network, respectively. Statistically significant performance gains are recorded when the meta-learning algorithm for noise reduction is used in combination with these artificial neural networks. For time series data that cannot be decomposed into meaningful trends, applying the moving average method to create meta-information for guiding the learning process is still better than the traditional approach. Therefore, this new approach is applicable to the forecasting of time series with a low signal-to-noise ratio, with the potential to scale adequately in a multi-cluster system due to the parallelized nature of the algorithm.
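
The additive idea behind the decomposition (components that sum back to the original series) can be illustrated with a minimal sketch. This is not the authors' algorithm and not empirical mode decomposition: it assumes a simple centered moving average as a stand-in for the global trend, and the function names and the `half_width` parameter are hypothetical, chosen only for illustration.

```python
from typing import List, Sequence, Tuple


def moving_average(series: Sequence[float], half_width: int) -> List[float]:
    """Centered moving average over up to 2*half_width + 1 points.

    Near the edges the window shrinks to the points that exist.
    """
    if half_width < 0:
        raise ValueError("half_width must be non-negative")
    smoothed = []
    for i in range(len(series)):
        lo = max(0, i - half_width)
        hi = min(len(series), i + half_width + 1)
        chunk = series[lo:hi]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


def decompose(series: Sequence[float], half_width: int) -> Tuple[List[float], List[float]]:
    """Split a series into a smooth (global) trend and a (local) residual.

    Assumption: the moving average stands in for the low-frequency
    component; the residual holds whatever the average smoothed away.
    """
    trend = moving_average(series, half_width)
    residual = [x - t for x, t in zip(series, trend)]
    return trend, residual


def reconstruct(components: Sequence[Sequence[float]]) -> List[float]:
    """Sum the components point by point to recover the series."""
    return [sum(values) for values in zip(*components)]


if __name__ == "__main__":
    data = [1.0, 2.0, 4.0, 3.0, 5.0, 4.0, 6.0]
    trend, residual = decompose(data, half_width=1)
    print("trend:   ", trend)
    print("residual:", residual)
    print("rebuilt: ", reconstruct([trend, residual]))
```

In a hierarchical setup of the kind the abstract describes, each component could then be passed to its own learner, with the per-component forecasts summed afterwards; how the components are actually learned and how noise is detected are details the abstract leaves to the full paper.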

Keywords