IEEE Access (Jan 2024)
Time Series Classification With Large Language Models via Linguistic Scaffolding
Abstract
Time series classification requires specialized models that can effectively capture temporal structure. Given their proficiency in sequence modeling and semantic reasoning, Large Language Models (LLMs) have emerged as promising candidates for this task. However, converting time series data into text typically yields sequences that exceed the model's maximum token limit, necessitating either truncation or the replacement of word embeddings with fixed-length time series embeddings. This restriction not only sacrifices the semantic reasoning capabilities accessed through natural language but also limits the ability to handle temporal irregularities. To overcome these challenges, we propose the Language-Scaffolded Time Series Transformer (LSTST), which combines linguistic components with time series embeddings to harness LLMs effectively while overcoming their dimensional constraints. Our Language Scaffold reformulates time series classification as a contextual question-answering task in which time series embeddings serve as the context, enabling the LLM to leverage its inherent semantic knowledge. Moreover, the preserved linguistic structure accommodates a dynamic number of input context embeddings with real-time positional encoding, addressing both length restrictions and irregularity in the temporal dimension. Through experiments, we show that LSTST achieves state-of-the-art performance on regular time series classification and handles irregular time series without any model modifications.
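To make the scaffold idea concrete, the following is a minimal sketch, not the authors' implementation: it shows how the token embeddings of a question-answering template could frame a variable-length block of time series embeddings, with positional encodings evaluated at raw timestamps so irregular sampling is handled without architectural changes. All names (ScaffoldEmbedder, ts_proj, time_pos_enc) and dimensions are illustrative assumptions.

    # Minimal sketch of a language-scaffolded input, assuming a PyTorch LLM front end.
    import torch
    import torch.nn as nn

    class ScaffoldEmbedder(nn.Module):
        def __init__(self, vocab_size=32000, d_model=256):
            super().__init__()
            self.tok_emb = nn.Embedding(vocab_size, d_model)  # word embeddings are kept, not removed
            self.ts_proj = nn.Linear(1, d_model)              # project scalar samples to model width
            self.d_model = d_model

        def time_pos_enc(self, t):
            # Sinusoidal encoding evaluated at real-valued timestamps, so irregularly
            # sampled points receive positions in real time rather than by index.
            half = self.d_model // 2
            freqs = torch.exp(-torch.arange(half) * torch.log(torch.tensor(10000.0)) / half)
            ang = t.unsqueeze(-1) * freqs                     # (n, half)
            return torch.cat([torch.sin(ang), torch.cos(ang)], dim=-1)

        def forward(self, prefix_ids, values, timestamps, suffix_ids):
            # Layout: "Question: ... Context:" <ts embeddings> "Answer:"
            prefix = self.tok_emb(prefix_ids)                 # (p, d) template tokens
            ts = self.ts_proj(values.unsqueeze(-1))           # (n, d); n may vary per sample
            ts = ts + self.time_pos_enc(timestamps)           # positional info from raw times
            suffix = self.tok_emb(suffix_ids)                 # (s, d) answer prompt tokens
            return torch.cat([prefix, ts, suffix], dim=0)     # sequence fed to the LLM

    emb = ScaffoldEmbedder()
    seq = emb(torch.tensor([1, 2, 3]),                        # hypothetical ids for the question template
              torch.randn(57),                                # 57 samples; any length fits
              torch.sort(torch.rand(57) * 10.0).values,       # irregular timestamps
              torch.tensor([4, 5]))                           # hypothetical ids for "Answer:"
    print(seq.shape)                                          # torch.Size([62, 256])

Because the surrounding template remains ordinary token embeddings, the number of context embeddings between prefix and suffix can change per sample, which is how a design of this kind sidesteps fixed-length constraints.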
Keywords