IEEE Access (Jan 2020)
A Novel GSCI-Based Ensemble Approach for Credit Scoring
Abstract
Credit scoring is an effective tool for financial institutions to manage credit risk. In recent years, many novel machine learning models have been developed for credit scoring. Among these, heterogeneous ensemble models have received much attention because of their superior performance. This paper presents a new heterogeneous ensemble model based on the generalized Shapley value and the Choquet integral. The model first uses a fuzzy measure to express the interactive characteristics between any two coalitions of base learners. Based on an objective function that combines accuracy and diversity, a linear programming model is built to determine the fuzzy measure. To retain as much of the original information as possible in the training stage, normal fuzzy numbers are employed to express the predicted values of the base learners. Then, the generalized Shapley Choquet integral (GSCI) aggregation operator is defined to calculate the comprehensive predicted value of the ensemble model. Based on the defined aggregation operator and the linear programming model, a GSCI approach is proposed for ensemble credit scoring. To illustrate the efficiency and feasibility of the GSCI approach, an experiment is conducted with thirteen machine learning models over four public credit scoring datasets and three large real-world P2P lending datasets. Furthermore, robustness tests and comparative analyses are carried out to demonstrate the adaptability and performance of the GSCI-based ensemble model.
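As a rough illustration of the aggregation idea, the sketch below computes the classical discrete Choquet integral of base learner scores with respect to a fuzzy measure. It is not the GSCI operator proposed in the paper, which additionally relies on generalized Shapley values and normal fuzzy numbers. The function name `choquet_integral`, the three base learners, and all measure values are hypothetical and chosen only for illustration.

```python
from typing import Dict, FrozenSet, Sequence


def choquet_integral(values: Sequence[float],
                     mu: Dict[FrozenSet[int], float]) -> float:
    """Classical discrete Choquet integral of `values` w.r.t. fuzzy measure `mu`.

    `mu` maps each coalition (frozenset of learner indices) to its measure,
    with mu(empty set) = 0 and mu(all learners) = 1.
    """
    # Sort learner indices by their predicted value, ascending.
    order = sorted(range(len(values)), key=lambda i: values[i])
    total = 0.0
    previous = 0.0
    for rank, idx in enumerate(order):
        # Coalition of learners whose value is at least values[idx].
        coalition = frozenset(order[rank:])
        total += (values[idx] - previous) * mu[coalition]
        previous = values[idx]
    return total


# Hypothetical monotone fuzzy measure over three base learners (0, 1, 2).
mu = {
    frozenset(): 0.0,
    frozenset({0}): 0.4,
    frozenset({1}): 0.3,
    frozenset({2}): 0.3,
    frozenset({0, 1}): 0.7,
    frozenset({0, 2}): 0.8,
    frozenset({1, 2}): 0.6,
    frozenset({0, 1, 2}): 1.0,
}

# Hypothetical predicted default probabilities from the three learners.
scores = [0.9, 0.5, 0.7]
ensemble_score = choquet_integral(scores, mu)
```

Because the measure of a coalition need not equal the sum of its members' measures, this kind of aggregation can reward or penalize interaction between base learners, which a plain weighted average cannot do.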
Keywords