Symmetry (Jun 2024)

Frequency-Enhanced Transformer with Symmetry-Based Lightweight Multi-Representation for Multivariate Time Series Forecasting

  • Chenyue Wang,
  • Zhouyuan Zhang,
  • Xin Wang,
  • Mingyang Liu,
  • Lin Chen,
  • Jiatian Pi

DOI
https://doi.org/10.3390/sym16070797
Journal volume & issue
Vol. 16, no. 7
p. 797

Abstract

Transformer-based methods have recently demonstrated their potential in time series forecasting. However, the mainstream approach, which primarily uses attention to model inter-step correlation in the time domain, is constrained by two significant issues that make multivariate forecasting ineffective and inefficient. The first is that key representations in the time domain are scattered and sparse, resulting in parameter bloat and increased difficulty in capturing time dependencies. The second is that treating time step points as uniformly embedded tokens erases inter-variate correlations. To address these challenges, we propose a frequency-wise and variables-oriented transformer-based method. This method leverages the intrinsic conjugate symmetry in the frequency domain, enabling compact frequency domain representations that naturally mix information across time points while reducing spatio-temporal costs. Multivariate inter-correlations can also be captured from similar frequency domain components, which enhances the modeling capability of the variables-oriented attention mechanism. Further, we employ both polar and complex domain perspectives to enrich the frequency domain representations and decode complicated temporal patterns. We propose frequency-enhanced independent representation multi-head attention (FIR-Attention) to leverage these advantages for improved multivariate interaction. Techniques such as frequency cutoff and equivalent mapping keep the model lightweight. Extensive experiments on eight mainstream datasets show that our approach achieves first-rate results and, importantly, requires only one percent of the spatio-temporal cost of mainstream methods.
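
The conjugate symmetry mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the helper names (dft, half_spectrum, truncate, to_polar, to_rect), the toy series, and the cutoff of 4 bins are all assumptions made for illustration. It shows only that the spectrum of a real-valued series is conjugate-symmetric, so roughly half of the bins suffice, and that the retained bins can be viewed either in polar form (amplitude, phase) or in complex form (real, imaginary).

```python
import cmath
import math


def dft(x):
    """Naive O(N^2) discrete Fourier transform of a sequence."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def half_spectrum(x):
    """Keep bins 0..N/2 only; the remaining bins are redundant for real input."""
    return dft(x)[: len(x) // 2 + 1]


def truncate(spec, cutoff):
    """Hypothetical low-pass step: keep the first `cutoff` bins."""
    return spec[:cutoff]


def to_polar(spec):
    """Polar view: one (amplitude, phase) pair per bin."""
    return [cmath.polar(c) for c in spec]


def to_rect(spec):
    """Complex view: one (real, imaginary) pair per bin."""
    return [(c.real, c.imag) for c in spec]


# Toy real-valued series of length 16 (an assumption, not data from the paper).
N = 16
series = [
    math.sin(2 * math.pi * t / N) + 0.5 * math.cos(2 * math.pi * 3 * t / N)
    for t in range(N)
]

full = dft(series)
# Conjugate symmetry of a real signal: X[N - k] equals conj(X[k]).
symmetric = all(
    abs(full[N - k] - full[k].conjugate()) < 1e-9 for k in range(1, N)
)

half = half_spectrum(series)   # N/2 + 1 bins instead of N
compact = truncate(half, 4)    # keep 4 low-frequency bins
polar = to_polar(compact)
rect = to_rect(compact)
```

For this toy series, the two sinusoids appear as amplitudes of 8 at bin 1 and 4 at bin 3, so the four retained bins carry all of the signal's energy. How the paper itself selects the cutoff and feeds the two views into FIR-Attention is not specified in the abstract and is not modeled here.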

Keywords