International Journal of Electrical Power & Energy Systems (Oct 2024)

Augmenting time series data: An interpretable approach with metric learning and variational autoencoders

  • Chunfeng Zhang,
  • Hao Qin,
  • Yongjun Zhang,
  • Chongying Jiang,
  • Di Zhang,
  • Wenyang Deng

Journal volume & issue
Vol. 161
p. 110190

Abstract

Deep learning techniques have shown remarkable performance in time series classification; however, their effectiveness is often compromised by insufficient data and class imbalance. To address these challenges, we propose an interpretable time series data augmentation algorithm that integrates variational autoencoders (VAE) and metric learning. The algorithm makes three core contributions. First, it eliminates the heteroscedasticity and non-stationarity of the data, so that the data satisfy the normality assumption in the encoder's latent space, avoiding error in approximating the real data distribution. Second, it uses metric learning to construct a discriminative VAE latent space suited to data augmentation, ensuring that the latent variable distribution accurately reflects the characteristics of the original data. Third, it applies a multi-seasonal decomposition of the time series to carry the structural features of the original series into the generated data, thereby enhancing the interpretability of data generation. Experiments on four multivariate time series data sets, including an electrical energy data set, show that the proposed algorithm outperforms existing methods in fidelity and prediction performance and exhibits high stability and generalization ability, particularly when data volume is limited. The algorithm not only improves the overall performance of time series classification models but also substantially reduces the cost of data collection and labeling, which makes it valuable in practical applications.
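
The abstract does not give the training objective, so the following is only a minimal sketch of how a VAE loss and a metric-learning term are commonly combined, not the authors' formulation. It assumes a diagonal Gaussian encoder, a squared-error reconstruction term, and a triplet loss on latent codes; the function names and the weights `beta`, `gamma`, and `margin` are hypothetical.

```python
import numpy as np

def kl_to_standard_normal(mu, logvar):
    """Closed-form KL(N(mu, diag(exp(logvar))) || N(0, I)), one value per sample."""
    return 0.5 * np.sum(mu**2 + np.exp(logvar) - logvar - 1.0, axis=1)

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge on squared Euclidean distances between latent codes.

    Pulls same-class codes (anchor, positive) together and pushes
    different-class codes (anchor, negative) at least `margin` apart.
    """
    d_pos = np.sum((anchor - positive) ** 2, axis=1)
    d_neg = np.sum((anchor - negative) ** 2, axis=1)
    return np.maximum(0.0, d_pos - d_neg + margin)

def combined_loss(x, x_recon, mu, logvar, anchor, positive, negative,
                  beta=1.0, gamma=1.0, margin=1.0):
    """Assumed objective: reconstruction + beta * KL + gamma * triplet."""
    recon = np.sum((x - x_recon) ** 2, axis=1)      # squared-error reconstruction
    kl = kl_to_standard_normal(mu, logvar)          # latent regularizer
    trip = triplet_loss(anchor, positive, negative, margin)
    return float(np.mean(recon + beta * kl) + gamma * np.mean(trip))
```

In this sketch the KL term keeps the latent codes close to a standard normal, while the triplet term makes the latent space class-discriminative, which is the general role the abstract attributes to metric learning.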

Keywords