Mathematics (Mar 2021)

Feature Selection in a Credit Scoring Model

  • Juan Laborda,
  • Seyong Ryoo

DOI
https://doi.org/10.3390/math9070746
Journal volume & issue
Vol. 9, no. 7
p. 746

Abstract


This paper proposes different classification algorithms (logistic regression, support vector machine, K-nearest neighbors, and random forest) to identify which candidates are likely to default in a credit scoring model. Three feature selection methods are used to mitigate the overfitting caused by the curse of dimensionality in these classification algorithms: one filter method (Chi-squared test and correlation coefficients) and two wrapper methods (forward stepwise selection and backward stepwise selection). The performance of these methods is compared using two measures, the mean absolute error and the number of selected features. The methodology is applied to a valuable database from Taiwan. The results suggest that forward stepwise selection yields superior performance with each of the classification algorithms used. The conclusions obtained are related to those in the literature, and their managerial implications are analyzed.
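As a rough illustration of the wrapper approach described in the abstract, the sketch below runs forward stepwise selection around a logistic regression classifier with scikit-learn and reports the two measures mentioned (mean absolute error and number of selected features). The synthetic data, the estimator settings, and the names X and y are assumptions for illustration, not the authors' exact pipeline or the Taiwan database.

```python
# Minimal sketch: forward stepwise (wrapper) feature selection for a
# binary default classifier, evaluated by MAE and feature count.
from sklearn.datasets import make_classification
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

# Synthetic stand-in for a credit dataset: y is the 0/1 default label.
X, y = make_classification(n_samples=2000, n_features=23, n_informative=8,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0)

base = LogisticRegression(max_iter=1000)

# Forward stepwise selection: greedily add the feature that most improves
# cross-validated performance until the requested number is reached.
selector = SequentialFeatureSelector(base, n_features_to_select=10,
                                     direction="forward", cv=5)
selector.fit(X_train, y_train)

model = base.fit(selector.transform(X_train), y_train)
pred = model.predict(selector.transform(X_test))

# The two measures discussed in the paper: mean absolute error of the
# 0/1 predictions and the number of features retained by the wrapper.
print("MAE:", mean_absolute_error(y_test, pred))
print("selected features:", selector.get_support().sum())
```

The same selector can wrap any of the other classifiers named in the abstract (for example, a random forest or K-nearest neighbors estimator) by swapping the base estimator, and `direction="backward"` gives the backward stepwise variant.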

Keywords