Autonomous Intelligent Systems (Feb 2024)
Shapley value: from cooperative game to explainable artificial intelligence
Abstract
Abstract With the tremendous success of machine learning (ML), concerns about their black-box nature have grown. The issue of interpretability affects trust in ML systems and raises ethical concerns such as algorithmic bias. In recent years, the feature attribution explanation method based on Shapley value has become the mainstream explainable artificial intelligence approach for explaining ML models. This paper provides a comprehensive overview of Shapley value-based attribution methods. We begin by outlining the foundational theory of Shapley value rooted in cooperative game theory and discussing its desirable properties. To enhance comprehension and aid in identifying relevant algorithms, we propose a comprehensive classification framework for existing Shapley value-based feature attribution methods from three dimensions: Shapley value type, feature replacement method, and approximation method. Furthermore, we emphasize the practical application of the Shapley value at different stages of ML model development, encompassing pre-modeling, modeling, and post-modeling phases. Finally, this work summarizes the limitations associated with the Shapley value and discusses potential directions for future research.
Keywords