Crude oil price forecasting using K-means clustering and LSTM model enhanced by dense-sparse-dense strategy

Alireza Jahandoost; Farhad Abedinzadeh Torghabeh; Seyyed Abed Hosseini; Mahboobeh Houshmand

doi:10.1186/s40537-024-00977-8

Journal of Big Data (Aug 2024)

Affiliations

Alireza Jahandoost: Department of Computer Engineering, Mashhad Branch, Islamic Azad University
Farhad Abedinzadeh Torghabeh: Department of Biomedical Engineering, Mashhad Branch, Islamic Azad University
Seyyed Abed Hosseini: Department of Electrical Engineering, Mashhad Branch, Islamic Azad University
Mahboobeh Houshmand: Department of Computer Engineering, Mashhad Branch, Islamic Azad University

DOI: https://doi.org/10.1186/s40537-024-00977-8
Journal volume & issue: Vol. 11, no. 1
pp. 1 – 22

Abstract

Crude oil is an essential energy source that affects international trade, transportation, and manufacturing, which makes it central to the economy. Forecasts of its future price influence consumer prices and energy markets, shape the development of sustainable energy, and inform financial planning, economic stability, and investment decisions. However, reliable forecasting remains an open problem because crude oil prices are highly volatile. Furthermore, many state-of-the-art methods rely on signal decomposition techniques, which can increase prediction time. This paper proposes a model called K-means-dense-sparse-dense long short-term memory (K-means-DSD-LSTM), which has three main training phases for crude oil price forecasting. In the first phase, the DSD-LSTM model is trained. In the second, the training portion of the data is clustered using the K-means algorithm. In the third, a copy of the trained DSD-LSTM model is fine-tuned for each resulting cluster. Fine-tuning helps each copy predict its own cluster better while it still generalizes well to the whole dataset, which reduces overfitting. The proposed model is evaluated on two well-known crude oil benchmarks: West Texas Intermediate (WTI) and Brent. Empirical evaluations demonstrated the superiority of the DSD-LSTM model over the K-means-LSTM model. Furthermore, the K-means-DSD-LSTM model exhibited even stronger performance. Notably, the proposed method yielded promising results across diverse datasets and achieved performance competitive with existing methods, even without employing signal decomposition techniques.
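
The three training phases described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. To stay dependency-light, it swaps the LSTM for a linear autoregressive model trained by gradient descent and uses a small NumPy K-means. The function names, the 50% pruning ratio, the learning rates, the lag of 4, and the nearest-centroid routing at prediction time are all assumptions made for illustration. The series is assumed to be scaled to roughly [0, 1] so that the fixed learning rates stay stable.

```python
import numpy as np

def make_windows(series, lag):
    """Split a 1-D series into lagged input windows and next-step targets."""
    series = np.asarray(series, dtype=float)
    X = np.array([series[i:i + lag] for i in range(len(series) - lag)])
    return X, series[lag:]

def train(w, X, y, mask=None, lr=0.01, epochs=200):
    """Gradient descent on mean squared error for a linear model (LSTM stand-in).
    If a mask is given, pruned weights are held at zero after every step."""
    w = w.copy()
    for _ in range(epochs):
        grad = 2.0 * X.T @ (X @ w - y) / len(y)
        w -= lr * grad
        if mask is not None:
            w *= mask
    return w

def dsd_train(X, y, sparsity=0.5):
    """Dense -> sparse -> dense training (sparsity ratio is an assumed value)."""
    w = train(np.zeros(X.shape[1]), X, y)            # dense phase
    n_prune = int(sparsity * w.size)
    mask = np.ones_like(w)
    mask[np.argsort(np.abs(w))[:n_prune]] = 0.0      # prune smallest magnitudes
    w = train(w * mask, X, y, mask=mask)             # sparse phase
    return train(w, X, y, lr=0.001)                  # re-dense phase, no mask

def assign(X, centers):
    """Index of the nearest center (squared Euclidean distance) for each row."""
    dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
    return np.argmin(dists, axis=1)

def kmeans(X, k, iters=50, seed=0):
    """Plain Lloyd's algorithm; returns the centers and the final labels."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        labels = assign(X, centers)
        for j in range(k):
            if np.any(labels == j):                  # leave empty clusters as-is
                centers[j] = X[labels == j].mean(axis=0)
    return centers, assign(X, centers)

def fit_pipeline(series, lag=4, k=2):
    """Phase 1: DSD training. Phase 2: K-means on training windows.
    Phase 3: fine-tune one copy of the base model per cluster."""
    X, y = make_windows(series, lag)
    base = dsd_train(X, y)
    centers, labels = kmeans(X, k)
    models = []
    for j in range(k):
        members = labels == j
        if np.any(members):
            models.append(train(base, X[members], y[members], lr=0.001, epochs=50))
        else:
            models.append(base.copy())
    return base, centers, models

def predict(window, centers, models):
    """Route a window to its nearest cluster's model (assumed routing rule)."""
    window = np.asarray(window, dtype=float)
    j = int(assign(window[None, :], centers)[0])
    return float(window @ models[j])

if __name__ == "__main__":
    # Synthetic, already-scaled series standing in for a price history.
    series = 0.5 + 0.4 * np.sin(np.linspace(0, 20, 200))
    base, centers, models = fit_pipeline(series)
    print(predict(series[-4:], centers, models))
```

Starting each cluster model from the shared base weights and fine-tuning with a small learning rate and few epochs keeps the copies close to the model trained on the full data, which mirrors the abstract's point about specializing per cluster while retaining general behavior.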

Published in Journal of Big Data

ISSN: 2196-1115 (Online)
Publisher: SpringerOpen
Country of publisher: United Kingdom
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering: Electronics: Computer engineering. Computer hardware; Technology: Technology (General): Industrial engineering. Management engineering: Information technology; Science: Mathematics: Instruments and machines: Electronic computers. Computer science
Website: https://journalofbigdata.springeropen.com
