Sistemas de Informação (Jun 2020)

Credit Assessment for Personal Loans using Machine Learning Techniques

  • NASCIMENTO, P. S.,
  • KOMATI, K. S.,
  • ANDRADE, J. O.

Journal volume & issue
Vol. 1, no. 25
pp. 28 – 41

Abstract

This paper proposes an architecture and compares regression methods for the analysis of personal loan proposals using machine learning algorithms. The case study deals with data from a financial company that has a historical database of credit analyses, comprising the client's profile information, their previous relationship history with the company, and the loan proposal data. The proposed architecture must meet the following objectives: decrease the cost of accessing data from credit protection agencies, improve accuracy compared to the existing process, allow the IT team to keep a control variable over whether a proposal is approved, and keep current employees' jobs. To meet all these requirements, a distinguishing feature of this proposal is the use of two stages of analysis: the first without data from credit protection agencies and the second with this type of data (if necessary). Another distinguishing feature is the use of regression models, whose outputs are converted into discrete results by comparison with denial and approval thresholds. In the first stage, a proposal is approved when the result is above a pass threshold, denied when it is below a fail threshold, and otherwise proceeds to the second stage. The second stage is similar, except that cases that do not qualify as approved or denied continue to manual analysis. The methodology comprises exploratory analysis and pre-processing of the data, followed by calibration, validation and testing of two regression models: Random Forest and Partial Least Squares. The experiments showed that Random Forest outperformed both Partial Least Squares and the existing system in the financial company. For this configuration, the first stage of the classification process is able to classify 86.56% of the proposals without manual intervention, and in the second stage a further 4.04% are also classified automatically, reaching 97% accuracy at the end of the second stage.
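
The two-stage decision flow described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the threshold values, the score scale and the `fetch_agency_data` callback are all hypothetical, and the two models are passed in as plain callables standing in for whatever regressors are actually trained.

```python
from typing import Callable, Mapping, Tuple

APPROVED, DENIED, UNDECIDED = "approved", "denied", "undecided"


def threshold_decision(score: float, fail_threshold: float, pass_threshold: float) -> str:
    """Convert a regression score into a discrete result using two thresholds."""
    if score > pass_threshold:
        return APPROVED
    if score < fail_threshold:
        return DENIED
    return UNDECIDED


def assess(
    proposal: Mapping[str, float],
    stage1_model: Callable[[Mapping[str, float]], float],
    stage2_model: Callable[[Mapping[str, float]], float],
    fetch_agency_data: Callable[[Mapping[str, float]], Mapping[str, float]],
    stage1_thresholds: Tuple[float, float] = (0.3, 0.7),  # hypothetical (fail, pass)
    stage2_thresholds: Tuple[float, float] = (0.4, 0.6),  # hypothetical (fail, pass)
) -> Tuple[str, str]:
    """Return (decision, stage) for one loan proposal."""
    # Stage 1: score the proposal without credit protection agency data.
    decision = threshold_decision(stage1_model(proposal), *stage1_thresholds)
    if decision != UNDECIDED:
        return decision, "stage 1"

    # Stage 2: only now fetch (and pay for) the agency data, then score again.
    enriched = {**proposal, **fetch_agency_data(proposal)}
    decision = threshold_decision(stage2_model(enriched), *stage2_thresholds)
    if decision != UNDECIDED:
        return decision, "stage 2"

    # Anything still undecided is handed to a human analyst.
    return "manual analysis", "stage 2"
```

In this sketch the agency lookup happens only for proposals that the first stage cannot decide, which mirrors the cost-reduction objective, and the thresholds are ordinary parameters, which is one way the control variable mentioned in the abstract could be exposed.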

Keywords