Applied Artificial Intelligence (Dec 2024)
Performance Evaluation of Hybrid Machine Learning Algorithms for Online Lending Credit Risk Prediction
Abstract
ABSTRACTPeer-to-Peer systems are still in the early stages of development when it comes to the processing of credit and the appraisal of the risk associated with it. In this study, we used a hybrid convolutional neural network with logistic regression, a gradient-boosting decision tree, and a k-nearest neighbor to predict the credit risk in a P2P lending club. The lending clubs publicly available P2P loan data was used to train the model. In order to address the issue of data imbalance within the dataset, specifically between the non-defaulter and defaulter classes, the synthetic minority oversampling technique sampling approach is utilized. We developed the architecture of our hybrid model by removing the fully connected layer with the soft-max, which is the final layer of the fully connected CNN model and replaced by LR, GBDT, and k-NN algorithms. The experimental results show that the hybrid CNN-kNN model outperforms the CNN-GBDT and CNN-LR models based on the performance metrics accuracy, recall, F1-score, and area under the curve for both all input and important features. This shows that hybrid machine learning models effectively identify and categorize credit risk in peer-to-peer lending clubs, hence assisting in financial loss prevention.