The performance of the LSTM-based code generated by Large Language Models (LLMs) in forecasting time series data

Saroj Gopali; Sima Siami-Namini; Faranak Abri; Akbar Siami Namin

Natural Language Processing Journal (Dec 2024)

Affiliations

Saroj Gopali: Department of Computer Science, Texas Tech University, USA; Corresponding author.
Sima Siami-Namini: Advanced Academic Programs, Johns Hopkins University, USA
Faranak Abri: Department of Computer Science, San Jose State University, USA
Akbar Siami Namin: Department of Computer Science, Texas Tech University, USA

Journal volume & issue: Vol. 9, p. 100120

Abstract

Generative AI, and in particular Large Language Models (LLMs), has gained substantial momentum due to its wide application across disciplines. While the use of these game-changing technologies in generating textual information has already been demonstrated in several application domains, their ability to generate complex models and executable code remains to be explored. An intriguing case is the quality of the machine learning and deep learning models that these LLMs generate for automated scientific data analysis, where a data analyst may lack the expertise to manually code and optimize complex deep learning models and may therefore opt to leverage LLMs to generate them. This paper investigates and compares the performance of mainstream LLMs, such as ChatGPT, PaLM, LLama, and Falcon, in generating deep learning models for analyzing time series data, an important and popular data type with applications in many domains, including finance and the stock market. This research conducts a set of controlled experiments in which the prompts for generating deep learning-based models are controlled with respect to the sensitivity levels of four criteria: (1) Clarity and Specificity, (2) Objective and Intent, (3) Contextual Information, and (4) Format and Style. While the results are relatively mixed, we observe some distinct patterns. We find that LLMs can generate deep learning-based models, with executable code, for each dataset separately whose performance is comparable to that of manually crafted and optimized LSTM models for predicting the whole time series dataset. We also find that ChatGPT outperforms the other LLMs in generating more accurate models. Furthermore, we observe that the quality of the generated models varies with the “temperature” parameter used in configuring the LLMs. The results can benefit data analysts and practitioners who would like to leverage generative AI to produce prediction models of acceptable quality.
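
To make the comparison described in the abstract concrete, the following is a minimal sketch of how a forecasting model's accuracy on a time series might be scored. It is not the authors' code. The helper names (make_windows, rmse, naive_forecast), the toy series, the lookback length, and the choice of root mean squared error as the metric are all assumptions made for illustration; the abstract does not state which metric or window size was used. A naive "repeat the last value" predictor stands in where an LSTM, hand-written or LLM-generated, would be plugged in.

```python
import math


def make_windows(series, lookback):
    """Turn a 1-D series into (input window, next value) pairs."""
    inputs, targets = [], []
    for start in range(len(series) - lookback):
        inputs.append(series[start:start + lookback])
        targets.append(series[start + lookback])
    return inputs, targets


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def naive_forecast(window):
    """Placeholder predictor (assumption): repeat the last observed value.

    A trained LSTM's one-step prediction would replace this function.
    """
    return window[-1]


# Toy data, invented for illustration only.
series = [10.0, 11.0, 13.0, 12.0, 15.0, 16.0]
inputs, targets = make_windows(series, lookback=3)
predictions = [naive_forecast(window) for window in inputs]
score = rmse(targets, predictions)
print(f"RMSE of the placeholder predictor: {score:.4f}")
```

Under these assumptions, two models (for example, a manually tuned LSTM and one generated by an LLM) could be compared by scoring both on the same held-out windows with the same metric, where a lower score indicates a more accurate model.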

Published in Natural Language Processing Journal

ISSN: 2949-7191 (Online)
Publisher: Elsevier
Country of publisher: Netherlands
LCC subjects: Language and Literature: Philology. Linguistics: Computational linguistics. Natural language processing
Website: https://www.sciencedirect.com/journal/natural-language-processing-journal
