Applied Sciences (Jun 2024)

Research on User Default Prediction Algorithm Based on Adjusted Homogenous and Heterogeneous Ensemble Learning

  • Yao Lu,
  • Kui Wang,
  • Hui Sun,
  • Hanwen Qu,
  • Jiajia Chen,
  • Wei Liu,
  • Chenjie Chang

DOI
https://doi.org/10.3390/app14135711
Journal volume & issue
Vol. 14, no. 13
p. 5711

Abstract

Credit risk has traditionally been assessed with econometric models. With the introduction of the “dual-carbon” goals to promote a low-carbon economy, the scale of green credit in China has expanded rapidly. In the big data era, however, a traditional single machine learning model has poor interpretability, struggles to capture nonlinear relationships, and falls short in prediction accuracy and robustness. This paper applies adjusted homogeneous and heterogeneous ensemble learning models to user default prediction; these models can efficiently process large quantities of high-dimensional data. Each model is adjusted to fit the task, and the models are compared with one another. The paper studies missing-value filling methods, feature selection, and ensemble models, and identifies the optimal ensemble model. In the comparison of single and ensemble models, Categorical Features Gradient Boosting (CatBoost) and Random Undersampling Boosting (RUSBoost) both reach 100% on accuracy, sensitivity, specificity, F1-Score, Kappa, and MCC. The experimental results show that the algorithm based on adjusted homogeneous and heterogeneous ensemble learning can predict user default efficiently and accurately. The paper also provides a reference for establishing a risk assessment index system.
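
The six evaluation metrics named in the abstract can all be derived from a binary confusion matrix. The following is a minimal sketch of the standard textbook definitions, not the authors' code: the function name `binary_metrics`, the zero-denominator conventions, and the example counts are assumptions made for illustration, with "default" treated as the positive class.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Compute six common binary-classification metrics from confusion-matrix counts.

    tp/fp/tn/fn are the counts of true positives, false positives,
    true negatives, and false negatives. Undefined ratios (zero
    denominator) are returned as 0.0, which is a convention chosen
    for this sketch only.
    """
    n = tp + fp + tn + fn
    if n == 0:
        raise ValueError("confusion matrix is empty")

    accuracy = (tp + tn) / n
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0  # recall on defaulters
    specificity = tn / (tn + fp) if (tn + fp) else 0.0  # recall on non-defaulters
    precision = tp / (tp + fp) if (tp + fp) else 0.0

    # F1 is the harmonic mean of precision and sensitivity.
    pr_sum = precision + sensitivity
    f1 = 2 * precision * sensitivity / pr_sum if pr_sum else 0.0

    # Cohen's kappa: observed agreement corrected for chance agreement p_e.
    p_e = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / (n * n)
    kappa = (accuracy - p_e) / (1 - p_e) if p_e != 1 else 0.0

    # Matthews correlation coefficient (MCC).
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0

    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
        "kappa": kappa,
        "mcc": mcc,
    }


# Hypothetical counts, not taken from the paper.
perfect = binary_metrics(tp=30, fp=0, tn=70, fn=0)
imperfect = binary_metrics(tp=40, fp=10, tn=45, fn=5)
```

Under these definitions, a classifier with no false positives and no false negatives on a test set containing both classes scores 1.0 on all six metrics, which is what a reported value of 100% on each metric corresponds to. Kappa and MCC are included alongside accuracy because they stay informative when defaulters are a small minority of users.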

Keywords