Achieving descriptive accuracy in explanations via argumentation: The case of probabilistic classifiers

Emanuele Albini; Antonio Rago; Pietro Baroni; Francesca Toni

doi:10.3389/frai.2023.1099407

Frontiers in Artificial Intelligence (Apr 2023)

Affiliations

Emanuele Albini: Department of Computing, Imperial College London, London, United Kingdom
Antonio Rago: Department of Computing, Imperial College London, London, United Kingdom
Pietro Baroni: Dipartimento di Ingegneria dell'Informazione, Università degli Studi di Brescia, Brescia, Italy
Francesca Toni: Department of Computing, Imperial College London, London, United Kingdom

DOI: https://doi.org/10.3389/frai.2023.1099407
Journal volume & issue: Vol. 6

Abstract

The pursuit of trust in and fairness of AI systems in order to enable human-centric goals has been gathering pace of late, often supported by the use of explanations for the outputs of these systems. Several properties of explanations have been highlighted as critical for achieving trustworthy and fair AI systems, but one that has thus far been overlooked is that of descriptive accuracy (DA), i.e., that the explanation contents are in correspondence with the internal working of the explained system. Indeed, the violation of this core property would lead to the paradoxical situation of systems producing explanations which are not suitably related to how the system actually works: clearly this may hinder user trust. Further, if explanations violate DA then they can be deceitful, resulting in unfair behavior toward users. Crucial as the DA property appears to be, it has been somewhat overlooked in the XAI literature to date. To address this problem, we consider the questions of formalizing DA and of analyzing its satisfaction by explanation methods. We provide formal definitions of naive, structural and dialectical DA, using the family of probabilistic classifiers as the context for our analysis. We evaluate the satisfaction of our given notions of DA by several explanation methods, amounting to two popular feature-attribution methods from the literature, variants thereof and a novel form of explanation that we propose. We conduct experiments with a varied selection of concrete probabilistic classifiers and highlight the importance, with a user study, of our most demanding notion of dialectical DA, which our novel method satisfies by design and others may violate. We thus demonstrate how DA could be a critical component in achieving trustworthy and fair systems, in line with the principles of human-centric AI.

Published in Frontiers in Artificial Intelligence

ISSN: 2624-8212 (Online)
Publisher: Frontiers Media S.A.
Country of publisher: Switzerland
LCC subjects: Science: Mathematics: Instruments and machines: Electronic computers. Computer science
Website: https://www.frontiersin.org/journals/artificial-intelligence#
