IEEE Access (Jan 2023)

Approximate Inverse Model Explanations (AIME): Unveiling Local and Global Insights in Machine Learning Models

  • Takafumi Nakanishi

DOI
https://doi.org/10.1109/ACCESS.2023.3314336
Journal volume & issue
Vol. 11
pp. 101020 – 101044

Abstract

Data-driven decision-making has become pervasive in the fields of interpretable machine learning and explainable AI (XAI). While both fields aim to improve human comprehension of machine learning models, they differ in focus. Interpretable machine learning centers on deciphering outcomes in transparent, or 'glass-box', models, whereas XAI focuses on creating tools that explain complex 'black-box' models in a human-understandable way. Some existing interpretable machine learning and XAI methods pose a forward problem: they derive how the prediction and estimation outputs of a black-box model change with respect to the input. However, methods that adopt the forward problem lead to non-intuitive explanations. Therefore, hypothesizing that the inverse problem can yield more intuitive explanations, we propose approximate inverse model explanations (AIME), which offer unified global and local feature importance by deriving approximate inverse operators for black-box models. Additionally, we introduce a representative instance similarity distribution plot, which aids comprehension of the predictive behavior of the model and of the target dataset. In our experiments with LightGBM, AIME proved effective across diverse data types, from tabular data and handwritten digit images to text data. The results demonstrate that AIME's explanations are not only simpler but also more intuitive than those generated by well-established methods such as LIME and SHAP. AIME also visualizes the similarity distribution with the target dataset, illustrating the relation between different predictions. Furthermore, AIME estimates local and global feature importance and provides fresh insights by visualizing the similarity distribution between representative estimation instances and the target dataset.
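
The abstract does not spell out how the approximate inverse operator is built, so the following is only a minimal sketch under stated assumptions, not the paper's implementation. It assumes the operator is a linear least-squares map from black-box outputs back to inputs, fitted with the Moore-Penrose pseudoinverse, that global importance for a class is the operator applied to a one-hot class vector, and that local importance is that back-projection weighted elementwise by the instance. The function names `approximate_inverse_operator`, `global_importance`, and `local_importance` are hypothetical, and the one-hot matrix `Y` stands in for the outputs of a black-box model such as LightGBM.

```python
import numpy as np


def approximate_inverse_operator(X, Y):
    """Fit a linear map A with A @ y ~ x in the least-squares sense.

    X: (n_samples, n_features) inputs.
    Y: (n_samples, n_outputs) black-box outputs for those inputs.
    Returns A with shape (n_features, n_outputs).
    Assumption: a pseudoinverse fit; the abstract does not give the construction.
    """
    return (np.linalg.pinv(Y) @ X).T


def global_importance(A, class_index):
    """Assumed global importance: back-project a one-hot class vector."""
    one_hot = np.zeros(A.shape[1])
    one_hot[class_index] = 1.0
    return A @ one_hot


def local_importance(A, x, y):
    """Assumed local importance: back-projection of y weighted by the instance x."""
    return (A @ y) * x


# Toy data: feature 0 is active for class 0, feature 1 for class 1,
# and feature 2 is constant across both classes.
X = np.array([[2.0, 0.0, 1.0],
              [4.0, 0.0, 1.0],
              [0.0, 3.0, 1.0],
              [0.0, 5.0, 1.0]])
# Stand-in for black-box outputs (one-hot class scores).
Y = np.array([[1.0, 0.0],
              [1.0, 0.0],
              [0.0, 1.0],
              [0.0, 1.0]])

A = approximate_inverse_operator(X, Y)
print(global_importance(A, 0))
print(local_importance(A, X[0], Y[0]))
```

With one-hot outputs, the fitted operator in this sketch reduces to per-class feature means, which is one way to read the claim that an inverse-direction explanation describes what a typical input for a given prediction looks like.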

Keywords