Jisuanji kexue yu tansuo (Jun 2024)

Interpretable Machine Learning Algorithm Based on Rules Ensemble and Its Application

  • MIN Jiyuan, LU Tongyu, REN Tingting, CHEN Ruhao

DOI
https://doi.org/10.3778/j.issn.1673-9418.2310026
Journal volume & issue
Vol. 18, no. 6
pp. 1476 – 1490

Abstract

Machine learning algorithms have achieved great success thanks to their strong predictive performance, but their applicability is limited in fields that demand highly interpretable models. To address this lack of interpretability, a new interpretable machine learning algorithm, ensemble trees penalized logistic rule regression, is proposed based on the idea of rule ensembles. It achieves predictive performance comparable to that of ensemble tree algorithms with lower structural complexity, while retaining the interpretability of logistic regression. First, branches are extracted from ensemble trees such as random forest and XGBoost and converted into logic rules. Next, the rule set is pruned and deduplicated to obtain a streamlined rule set. Finally, the rules are incorporated into a logistic regression as variables, and model complexity is controlled with the Lasso. Taking enterprise risk early warning as an example, the algorithm is compared with multiple machine learning algorithms. The results show that it inherits the ensemble trees' ability to discriminate defaults and outperforms most of the compared algorithms on various classification metrics, and that its rules yield thresholds for the enterprise risk indicators, which helps enterprises carry out risk management. Further, an enterprise credit score is constructed from the algorithm, which demonstrates its wide applicability. The resulting score conforms to objective regularities and is discriminative, and the robustness of the model’s predictive performance is verified on three public datasets.
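
The three steps described in the abstract (rule extraction, deduplication, Lasso-penalized logistic regression on the rules) can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation. The toy trees, the feature names `debt_ratio` and `roa`, the thresholds, the data, and every function name are hypothetical. The sketch also assumes that every path prefix counts as a rule, omits the pruning step, and fits the L1 penalty by proximal gradient descent purely for self-containment.

```python
import math

# Hypothetical toy trees, not from the paper. An internal node is
# (feature, threshold, left, right); a leaf is None.
# The left child covers x[feature] <= threshold, the right child x[feature] > threshold.
tree_a = ("debt_ratio", 0.6, None, ("roa", 0.02, None, None))
tree_b = ("debt_ratio", 0.6, None, None)

def extract_rules(node, path=()):
    """Collect every root-to-node path as a tuple of (feature, op, threshold) conditions."""
    if node is None:
        return []
    feature, threshold, left, right = node
    rules = []
    for op, child in (("<=", left), (">", right)):
        rule = path + ((feature, op, threshold),)
        rules.append(rule)
        rules.extend(extract_rules(child, rule))
    return rules

def dedupe(rules):
    """Drop duplicates, treating a rule as an unordered set of conditions."""
    seen, unique = set(), []
    for rule in rules:
        key = frozenset(rule)
        if key not in seen:
            seen.add(key)
            unique.append(rule)
    return unique

def rule_fires(rule, x):
    """True when the sample x satisfies every condition of the rule."""
    return all(x[f] <= t if op == "<=" else x[f] > t for f, op, t in rule)

def to_rule_features(rules, X):
    """Encode each sample as a 0/1 vector with one entry per rule."""
    return [[1.0 if rule_fires(r, x) else 0.0 for r in rules] for x in X]

def soft_threshold(w, t):
    """Proximal operator of the L1 penalty: shrink w toward zero by t."""
    return math.copysign(max(abs(w) - t, 0.0), w)

def fit_l1_logistic(Z, y, lam=0.05, lr=0.5, n_iter=500):
    """L1-penalized logistic regression via proximal gradient descent (intercept unpenalized)."""
    n, p = len(Z), len(Z[0])
    w, b = [0.0] * p, 0.0
    for _ in range(n_iter):
        grad_w, grad_b = [0.0] * p, 0.0
        for z, target in zip(Z, y):
            logit = b + sum(wj * zj for wj, zj in zip(w, z))
            err = 1.0 / (1.0 + math.exp(-logit)) - target
            grad_b += err / n
            for j in range(p):
                grad_w[j] += err * z[j] / n
        b -= lr * grad_b
        w = [soft_threshold(wj - lr * gj, lr * lam) for wj, gj in zip(w, grad_w)]
    return w, b

# Made-up samples: label 1 marks a risky enterprise.
X = [
    {"debt_ratio": 0.30, "roa": 0.05}, {"debt_ratio": 0.50, "roa": 0.01},
    {"debt_ratio": 0.70, "roa": 0.01}, {"debt_ratio": 0.80, "roa": 0.00},
    {"debt_ratio": 0.70, "roa": 0.06}, {"debt_ratio": 0.90, "roa": 0.04},
    {"debt_ratio": 0.40, "roa": 0.03}, {"debt_ratio": 0.65, "roa": -0.01},
]
y = [0, 0, 1, 1, 0, 0, 0, 1]

rules = dedupe(extract_rules(tree_a) + extract_rules(tree_b))
Z = to_rule_features(rules, X)
w, b = fit_l1_logistic(Z, y)

# Print the rules that survive the L1 penalty, with their coefficients.
for rule, wj in zip(rules, w):
    if abs(wj) > 1e-9:
        print(" AND ".join(f"{f} {op} {t}" for f, op, t in rule), round(wj, 3))
```

Each surviving rule reads as a threshold statement on the original indicators, and its coefficient plays the same role as an ordinary logistic regression coefficient. In practice the rules would come from a fitted random forest or XGBoost model rather than hand-written tuples.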

Keywords