IEEE Access (Jan 2023)
Cost Harmonization LightGBM-Based Stock Market Prediction
Abstract
Stock market prediction (SMP) is a challenging task due to its uncertainty, nonlinearity, and volatility. Machine learning models, such as artificial neural networks (ANNs) and support vector regression (SVR), have been widely used for stock market prediction and have achieved high performance in the sense of “minimum errors.” In the context of SMP, however, it is more meaningful to measure performance in terms of “minimum cost.” For example, a false positive error (FPE) could result in a large trading loss, while a false negative error (FNE) might only miss an opportunity. For a “cautious” investor, fewer FPEs are preferable. Cost-sensitive learning has already been used in areas such as fraud detection and medical diagnosis. In our earlier study, we proposed a false-sensitive method called focal-loss LightGBM (FL-LightGBM) for SMP by introducing a cost-aware loss into LightGBM, which is known to be a fast and efficient gradient-boosting learning algorithm for solving large-scale problems. FL-LightGBM, however, still assumes that all false negative errors (or all false positive errors) contribute equally to the final cost. Trading strategies learned this way might be useful only for an investor who is always “aggressive” or always “cautious.” In practice, some errors may result in irreversible loss, so it is important to measure the cost based on the data rather than on the investor’s character. In this paper, we propose a new method called cost-harmonization loss-based LightGBM (CHL-LightGBM), in which the cost for each datum is calculated dynamically based on the difficulty of the datum. To verify the effectiveness of CHL-LightGBM, we compare LightGBM, XGBoost, decision trees, FL-LightGBM, and CHL-LightGBM for stock prediction on data from the Shanghai, Hong Kong, and NASDAQ stock exchanges. The simulation results show that although there is no significant difference between CHL-LightGBM and the other models in accuracy and winning rate, CHL-LightGBM obtained the highest annual return on all the test data.
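The distinction between “minimum errors” and “minimum cost” can be illustrated with a minimal sketch. The cost values (`fp_cost`, `fn_cost`), the toy labels, and the function names below are illustrative assumptions and are not taken from the paper. The `focal_loss` function is the standard binary focal loss from the general literature, shown only to illustrate how a loss can down-weight easy samples; it is not necessarily the exact FL-LightGBM formulation, and it is not the cost-harmonization loss of CHL-LightGBM, whose form is not given in the abstract.

```python
import math

def trading_cost(y_true, y_pred, fp_cost=5.0, fn_cost=1.0):
    """Total cost of binary up(1)/down(0) predictions.

    fp_cost and fn_cost are hypothetical values: a false positive
    (predicting a rise that does not happen) is assumed to be costlier
    than a false negative (missing a rise).
    """
    cost = 0.0
    for t, p in zip(y_true, y_pred):
        if p == 1 and t == 0:      # false positive: possible trading loss
            cost += fp_cost
        elif p == 0 and t == 1:    # false negative: missed opportunity
            cost += fn_cost
    return cost

def focal_loss(y_true, prob_up, gamma=2.0, alpha=0.25):
    """Standard binary focal loss for one sample.

    p_t is the predicted probability of the true class; the factor
    (1 - p_t) ** gamma shrinks the loss of well-classified samples.
    """
    p = min(max(prob_up, 1e-12), 1.0 - 1e-12)  # avoid log(0)
    p_t = p if y_true == 1 else 1.0 - p
    alpha_t = alpha if y_true == 1 else 1.0 - alpha
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)

# Two toy models with the same number of errors but different costs.
y_true = [1, 0, 1, 0]
model_a = [1, 1, 1, 0]  # one false positive
model_b = [0, 0, 1, 0]  # one false negative

cost_a = trading_cost(y_true, model_a)
cost_b = trading_cost(y_true, model_b)
```

Under these assumed costs, both toy models make exactly one error, yet `model_a` is five times as costly as `model_b`, which is the sense in which an error count alone can be misleading for trading.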
Keywords