IEEE Access (Jan 2024)

Temporal Multi-Features Representation Learning-Based Clustering for Time-Series Data

  • Jaehoon Lee,
  • Dohee Kim,
  • Sunghyun Sim

DOI
https://doi.org/10.1109/ACCESS.2024.3417348
Journal volume & issue
Vol. 12
pp. 87675 – 87690

Abstract

Time-series clustering remains a challenge in data mining. Although novel deep-learning-based representation learning integrated with deep clustering methods has considerably enhanced the performance of time-series clustering, efficiently capturing the various temporal patterns inherent in time-series data remains difficult for representation learning. In this study, we propose a novel representation learning method called temporal multi-features representation learning (TMRL) to capture the various temporal patterns embedded in time-series data. Based on TMRL, we introduce the temporal multi-features representation clustering (TMRC) framework for time-series clustering. The proposed framework uses variational-mode decomposition to decompose the input time-series data into k temporal patterns, and TMRL uses k LSTM autoencoders to extract features specialized for each decomposed pattern. Finally, the temporal multi-features derived from TMRL are ensembled for time-series clustering. To evaluate the proposed method, comparative experiments were conducted on 36 publicly available time-series datasets against 16 baseline models. In these experiments, the proposed method achieved the highest Rand index (RI) and normalized mutual information (NMI) values on 12 time-series datasets. In particular, on the datasets covering eight motion and spectro types, the proposed method attained the highest RI and NMI values on six datasets. Furthermore, visualizations of the features learned through TMRL showed better-separated representations than those of existing methods. These results indicate that the proposed TMRC framework is well suited to learning representations of time-series data and can be used effectively for time-series clustering.
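
The two evaluation measures named in the abstract, RI and NMI, can be illustrated with a minimal pure-Python sketch. This is not the authors' evaluation code: the function names are hypothetical, and the NMI normalization (arithmetic mean of the two entropies) is an assumption, since the abstract does not state which variant was used.

```python
from collections import Counter
from itertools import combinations
from math import log


def rand_index(labels_true, labels_pred):
    """Fraction of sample pairs on which the two partitions agree."""
    if len(labels_true) != len(labels_pred) or len(labels_true) < 2:
        raise ValueError("need two label lists of equal length >= 2")
    pairs = list(combinations(range(len(labels_true)), 2))
    # A pair agrees when both partitions put it together or both split it.
    agree = sum(
        (labels_true[i] == labels_true[j]) == (labels_pred[i] == labels_pred[j])
        for i, j in pairs
    )
    return agree / len(pairs)


def entropy(labels):
    """Shannon entropy (natural log) of a label assignment."""
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())


def normalized_mutual_information(labels_true, labels_pred):
    """Mutual information divided by the mean entropy (assumed variant)."""
    n = len(labels_true)
    joint = Counter(zip(labels_true, labels_pred))
    count_true = Counter(labels_true)
    count_pred = Counter(labels_pred)
    mi = sum(
        (c / n) * log(c * n / (count_true[a] * count_pred[b]))
        for (a, b), c in joint.items()
    )
    h_true, h_pred = entropy(labels_true), entropy(labels_pred)
    if h_true + h_pred == 0:
        return 1.0  # both partitions are a single cluster
    return 2 * mi / (h_true + h_pred)


if __name__ == "__main__":
    truth = [0, 0, 1, 1]
    predicted = [1, 1, 0, 0]  # same partition, cluster names swapped
    print(rand_index(truth, predicted))
    print(normalized_mutual_information(truth, predicted))
```

Both measures compare partitions rather than label names, so a clustering that recovers the true groups under different cluster IDs still scores at the maximum.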

Keywords