IEEE Access (Jan 2023)
BELLATREX: Building Explanations Through a LocaLly AccuraTe Rule EXtractor
Abstract
Random forests are machine learning methods characterised by high performance and robustness to overfitting. However, since multiple learners are combined, they are not as interpretable as a single decision tree. In this work we propose a novel method, Building Explanations through a LocaLly AccuraTe Rule EXtractor (Bellatrex), which explains the forest prediction for a given test instance with only a few diverse rules. Starting from the decision trees generated by a random forest, our method: 1) pre-selects a subset of the rules used to make the prediction; 2) creates a vector representation of such rules; 3) projects them to a low-dimensional space; 4) clusters such representations and picks a rule from each cluster to explain the instance prediction. We test the effectiveness of Bellatrex on 89 real-world datasets and demonstrate its validity for binary classification, regression, multi-label classification and time-to-event tasks. To the best of our knowledge, this is the first interpretability toolbox that handles all these tasks within the same framework. We also show that Bellatrex approximates the predictive performance of the corresponding ensemble model in all considered tasks, and it does so while selecting at most three rules from the whole forest. Finally, a comparison with similar methods in the literature shows that our proposed approach substantially outperforms other explainable toolboxes in terms of predictive performance.
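The four-step pipeline in the abstract can be sketched with off-the-shelf scikit-learn primitives. This is only an illustrative approximation, not the authors' implementation: the rule pre-selection criterion, the split-count vectorisation, and the parameter choices (20 pre-selected rules, 2 PCA components, 3 clusters) are all assumptions made for the sketch.

```python
# Illustrative sketch of the Bellatrex-style pipeline; all design
# choices below are assumptions, not the paper's actual method.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

X, y = make_classification(n_samples=200, n_features=10, random_state=0)
rf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
x_test = X[:1]  # the instance whose prediction we want to explain

# 1) Pre-select rules: keep the trees whose individual prediction is
#    closest to the forest prediction for this instance (simple proxy).
forest_p = rf.predict_proba(x_test)[0, 1]
tree_p = np.array([t.predict_proba(x_test)[0, 1] for t in rf.estimators_])
keep = np.argsort(np.abs(tree_p - forest_p))[:20]

# 2) Vectorise each rule: per-feature count of splits along the
#    instance's decision path in that tree (an assumed representation).
def rule_vector(tree, x):
    path = tree.decision_path(x).indices      # node ids on the path
    feats = tree.tree_.feature[path]          # split feature per node (-2 = leaf)
    vec = np.zeros(x.shape[1])
    for f in feats:
        if f >= 0:
            vec[f] += 1
    return vec

R = np.array([rule_vector(rf.estimators_[i], x_test) for i in keep])

# 3) Project the rule vectors to a low-dimensional space.
Z = PCA(n_components=2, random_state=0).fit_transform(R)

# 4) Cluster the projections and pick, per cluster, the rule closest
#    to the centroid: these few rules form the explanation.
km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(Z)
reps = []
for c in range(km.n_clusters):
    members = np.where(km.labels_ == c)[0]
    dist = np.linalg.norm(Z[members] - km.cluster_centers_[c], axis=1)
    reps.append(int(keep[members[np.argmin(dist)]]))

print("representative trees:", sorted(reps))  # at most three rules
```

The final `reps` list plays the role of the "at most three rules" mentioned in the abstract: one representative rule per cluster, drawn from the trees that best agree with the forest on this instance.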
Keywords