Journal of Big Data (May 2025)

Optimal Markowitz portfolio using returns forecasted with time series and machine learning models

  • Damian Ślusarczyk,
  • Robert Ślepaczuk

DOI
https://doi.org/10.1186/s40537-025-01164-z
Journal volume & issue
Vol. 12, no. 1
pp. 1 – 40

Abstract

We examine whether using stock returns forecasted with machine learning and time series models in a mean-variance portfolio framework yields better results than relying on historical returns. Although the problem of efficient stock selection has been studied for more than 50 years, the questions of how to construct a mean-variance portfolio adequately and how to incorporate return forecasts into it remain unsolved. Stock portfolios were created using ‘raw’ historical returns and returns forecasted with ARIMA-GARCH and XGBoost models. Two optimization problems were considered: global maximum information ratio and global minimum variance. The resulting strategies were compared with two benchmarks: an equally weighted portfolio and buy-and-hold on the Dow Jones Industrial Average (DJIA) index. Strategies were tested on DJIA stocks over the period from 2007-01-01 to 2022-12-31 using daily data. The main portfolio performance metrics were the information ratio and the modified information ratio. The results show that using forecasted returns can enhance portfolio selection based on the Markowitz framework, but it is not a universal solution: all the parameters and hyperparameters of the selected models have to be controlled. This paper contributes to the field of financial engineering by applying ARIMA-GARCH bootstrapping forecasts and XGBoost forecasts in portfolio optimization and testing a wide range of hyperparameters. In contrast to most similar papers, which perform a much smaller number of fits, we present an approach that requires fitting tens of models each period, resulting in thousands of fits in the base case scenario alone. Such a method requires specific handling and poses new challenges in testing and optimizing the models. Moreover, we are not aware of any other paper that compares the performance of ARIMA-GARCH bootstrapping forecasts and XGBoost forecasts in terms of portfolio optimization.
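
For readers unfamiliar with the two optimization problems named in the abstract, the following is a minimal sketch of their textbook closed-form solutions. It is not the authors' implementation. It assumes a fully invested portfolio with short selling allowed, and it assumes the ratio being maximized is expected return divided by standard deviation, which the abstract does not define. The function names and the two-asset inputs are hypothetical; in the setting described above, `mu` would hold either historical mean returns or model forecasts.

```python
import numpy as np


def min_variance_weights(cov):
    """Global minimum variance weights: proportional to inv(cov) @ 1.

    Assumes weights sum to 1 and short selling is allowed.
    """
    ones = np.ones(cov.shape[0])
    raw = np.linalg.solve(cov, ones)  # inv(cov) @ ones without forming the inverse
    return raw / raw.sum()


def max_ratio_weights(mu, cov):
    """Weights maximizing (mu @ w) / sqrt(w @ cov @ w) with sum(w) == 1.

    The maximizer is proportional to inv(cov) @ mu. Normalizing to sum 1
    only preserves the direction when the raw weights sum to a positive number.
    """
    raw = np.linalg.solve(cov, mu)  # inv(cov) @ mu
    total = raw.sum()
    if total <= 0:
        raise ValueError("raw weights do not sum to a positive number")
    return raw / total


# Hypothetical two-asset inputs, for illustration only.
mu = np.array([0.10, 0.05])           # expected (or forecasted) returns
cov = np.array([[0.04, 0.00],
                [0.00, 0.01]])        # return covariance matrix

w_gmv = min_variance_weights(cov)     # approximately [0.2, 0.8]
w_max = max_ratio_weights(mu, cov)    # approximately [1/3, 2/3]
```

With long-only or other weight constraints, no closed form applies and a numerical optimizer would be needed instead.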

Keywords