Heliyon (Aug 2019)

Stock selection with random forest: An exploitation of excess return in the Chinese stock market

  • Zheng Tan,
  • Ziqin Yan,
  • Guangwei Zhu

Journal volume & issue
Vol. 5, no. 8
p. e02310

Abstract

Read online

In recent years, a variety of research fields, including finance, have begun to place great emphasis on machine learning techniques because they exhibit broad abilities to simulate more complicated problems. In contrast to the traditional linear regression scheme that is usually used to describe the relationship between the stock forward return and company characteristics, the field of finance has experienced the rapid development of tree-based algorithms and neural network paradigms when illustrating complex stock dynamics. These nonlinear methods have proved to be effective in predicting stock prices and selecting stocks that can outperform the general market. This article implements and evaluates the robustness of the random forest (RF) model in the context of the stock selection strategy. The model is trained for stocks in the Chinese stock market, and two types of feature spaces, fundamental/technical feature space and pure momentum feature space, are adopted to forecast the price trend in the long run and the short run, respectively. It is evidenced that both feature paradigms have led to remarkable excess returns during the past five out-of-sample period years, with the Sharpe ratios calculated to be 2.75 and 5 for the portfolio net value of the multi-factor space strategy and momentum space strategy, respectively. Although the excess return has weakened in recent years with respect to the multi-factor strategy, our findings point to a less efficient market that is far from equilibrium.

Keywords