IEEE Access (Jan 2023)

EvalAttAI: A Holistic Approach to Evaluating Attribution Maps in Robust and Non-Robust Models

  • Ian E. Nielsen,
  • Ravi P. Ramachandran,
  • Nidhal Bouaynaya,
  • Hassan M. Fathallah-Shaykh,
  • Ghulam Rasool

DOI
https://doi.org/10.1109/ACCESS.2023.3300242
Journal volume & issue
Vol. 11
pp. 82556–82569

Abstract

The expansion of explainable artificial intelligence as a field of research has generated numerous methods for visualizing and understanding the black box of a machine learning model. Attribution maps are commonly used to highlight the parts of the input image that influence the model to make a specific decision. At the same time, numerous recent research papers in machine learning and explainable artificial intelligence have demonstrated the essential role of robustness to natural noise and adversarial attacks in determining the features learned by a model. This paper focuses on evaluating attribution mapping methods to determine whether robust neural networks are more explainable, particularly in the context of medical image classification. However, there is no consensus on how to evaluate attribution maps. To address this, we propose a new explainability faithfulness metric, EvalAttAI, that addresses the limitations of prior metrics. We evaluate various attribution methods on multiple datasets and find that Bayesian deep neural networks using the Variational Density Propagation technique are consistently more explainable when used with the best-performing attribution method, the Vanilla Gradient. Our results suggest that robust neural networks may not always be more explainable, despite producing more visually plausible attribution maps.
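
The Vanilla Gradient attribution mentioned in the abstract is simply the gradient of a class score with respect to the input pixels. The sketch below illustrates the general technique in PyTorch; it is not the authors' implementation, and the function name, the classifier `model`, and the preprocessed `image` tensor are illustrative assumptions.

```python
import torch

def vanilla_gradient(model, image, target_class=None):
    """Return a per-pixel saliency map |d(score)/d(input)| for one image.

    Assumes `model` is a PyTorch image classifier and `image` is a
    preprocessed tensor of shape (1, C, H, W).
    """
    model.eval()
    image = image.clone().requires_grad_(True)
    scores = model(image)  # shape (1, num_classes)
    if target_class is None:
        target_class = scores.argmax(dim=1).item()
    # Backpropagate the chosen class score to the input pixels.
    scores[0, target_class].backward()
    # Aggregate gradient magnitude over color channels -> (H, W) map.
    return image.grad.abs().sum(dim=1).squeeze(0)
```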

Keywords