IEEE Access (Jan 2024)

Effective Credit Risk Prediction Using Ensemble Classifiers With Model Explanation

  • Idowu Aruleba,
  • Yanxia Sun

DOI
https://doi.org/10.1109/ACCESS.2024.3445308
Journal volume & issue
Vol. 12
pp. 115015 – 115025

Abstract

Credit risk prediction is a critical task in the financial industry, allowing lenders to assess the likelihood of a borrower defaulting on a loan. Traditional machine learning (ML) classifiers have been widely used for this purpose, but they often struggle with imbalanced data and lack interpretability, making it challenging for financial institutions to make informed decisions. This article explores the use of ensemble classifiers and the synthetic minority over-sampling technique with edited nearest neighbor (SMOTE-ENN) in credit risk prediction, aiming to improve classification performance. The ensemble classifiers are Random Forest, adaptive boosting (AdaBoost), extreme gradient boosting (XGBoost), and light gradient boosting machine (LightGBM). The study addresses class imbalance by combining these ensemble classifiers with SMOTE-ENN resampling, and it employs SHapley Additive exPlanations (SHAP) for model interpretability. The experimental results show that the proposed approach improves classification performance. Specifically, on the German credit dataset, XGBoost outperformed the other models with a Recall of 0.930 and a Specificity of 0.846, while Random Forest obtained the best performance on the Australian dataset, achieving a Recall of 0.907 and a Specificity of 0.922. Additionally, the integration of SHAP enhanced the models’ transparency by providing insight into the contribution of individual features, which is crucial for informed financial decision-making.
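
The two metrics reported above, Recall and Specificity, can be illustrated with a minimal sketch. This is not the authors' code: the function names, the toy labels, and the choice of label 1 as the positive class are all assumptions made for illustration. Recall is the share of actual positives that are predicted positive, TP / (TP + FN), and Specificity is the share of actual negatives that are predicted negative, TN / (TN + FP).

```python
from typing import Sequence, Tuple


def confusion_counts(
    y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1
) -> Tuple[int, int, int, int]:
    """Return (tp, fn, tn, fp) for a binary problem.

    `positive` is the label treated as the positive class; which class
    plays that role in the article is an assumption here.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    tp = fn = tn = fp = 0
    for actual, predicted in zip(y_true, y_pred):
        if actual == positive:
            if predicted == positive:
                tp += 1
            else:
                fn += 1
        else:
            if predicted == positive:
                fp += 1
            else:
                tn += 1
    return tp, fn, tn, fp


def recall(y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1) -> float:
    """Recall (sensitivity): TP / (TP + FN)."""
    tp, fn, _, _ = confusion_counts(y_true, y_pred, positive)
    return tp / (tp + fn) if (tp + fn) else 0.0


def specificity(y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1) -> float:
    """Specificity (true negative rate): TN / (TN + FP)."""
    _, _, tn, fp = confusion_counts(y_true, y_pred, positive)
    return tn / (tn + fp) if (tn + fp) else 0.0


# Toy labels, invented purely for illustration (not data from the article).
toy_true = [1, 1, 1, 0, 0, 0, 0, 1]
toy_pred = [1, 1, 0, 0, 0, 1, 0, 1]

toy_recall = recall(toy_true, toy_pred)
toy_specificity = specificity(toy_true, toy_pred)
```

Reporting both values matters on imbalanced data, because a classifier that always predicts the majority class scores perfectly on one of the two metrics and zero on the other.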

Keywords