Annals of the University of Oradea: Economic Science (Dec 2023)
DECISION TREE OR LOGISTIC REGRESSION - WHICH BASIC MODEL IS BETTER?
Abstract
In this paper, I aim to show which of the data recorded in the Central Credit Information System influence loan default, to analyse these data with a decision tree and with logistic regression, and to determine which of the two basic models performs better. For the analyses, I used a random sample of 500 items that reflects the proportions of performing and nonperforming loans in the population. Both methods found the same single variable to be significant: the ratio of the repayment to the contract amount. Of the data recorded by the Central Credit Information System, this is therefore the most significant in terms of loan defaults. Comparing the two methods, I conclude that both have a high level of accuracy, but logistic regression produced the better results because it identified a higher proportion of defaulted loans. The decision tree, despite its higher classification accuracy, could not identify any defaulting loans; the reason may be the unfavourable sample composition. Overall, the logistic regression categorized the transactions with 81.1% accuracy and achieved a better AUC value and a better Gini coefficient.
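The contrast between accuracy and the ability to identify defaulted loans can be illustrated with a minimal sketch. The data below are hypothetical toy values, not the paper's sample, and the function names are illustrative assumptions. The sketch shows how a classifier that labels every loan as performing can still reach high accuracy on an imbalanced sample, and how the Gini coefficient is commonly derived from the AUC as 2 × AUC − 1.

```python
def accuracy(y_true, y_pred):
    """Share of loans whose predicted class matches the true class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true, y_pred):
    """Share of defaulted loans (label 1) that the model flags as defaulted."""
    positives = sum(y_true)
    hits = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    return hits / positives if positives else 0.0


def auc(y_true, scores):
    """AUC as the probability that a randomly chosen defaulted loan
    receives a higher score than a randomly chosen performing loan
    (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def gini(auc_value):
    """Gini coefficient derived from the AUC."""
    return 2 * auc_value - 1


# Hypothetical toy sample: 8 performing loans (0) and 2 defaulted loans (1).
y_true = [0] * 8 + [1] * 2

# A model that predicts "performing" for every loan.
always_performing = [0] * 10

# Hypothetical default scores from a model that ranks loans by risk.
scores = [0.1, 0.2, 0.15, 0.3, 0.05, 0.25, 0.6, 0.2, 0.7, 0.4]

acc_trivial = accuracy(y_true, always_performing)
rec_trivial = recall(y_true, always_performing)
auc_scores = auc(y_true, scores)
gini_scores = gini(auc_scores)

print(acc_trivial, rec_trivial, auc_scores, gini_scores)
```

In this toy sample the trivial model is right for every performing loan yet finds none of the defaults, which is why accuracy alone is not enough to compare the two models.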
Keywords