IEEE Access (Jan 2022)
GSM-HM: Generation of Saliency Maps for Black-Box Object Detection Model Based on Hierarchical Masking
Abstract
Interpretability of DNN-based object detection is a growing concern for the research community. A first step towards this goal is a saliency map, which visualizes the importance (saliency) of each pixel in an image for an object detected by a specific model. Black-box methods generate a saliency map without looking into the internals of a model, so they apply to all models without adaptation. In addition, because they measure the effect of removing pixels from the image, they evaluate pixel saliency more reliably than white-box methods. However, current black-box methods remove pixels with random image masks. A great number of random masks is needed for sufficient coverage, and even then the quality of the resulting pixel saliency is not assured. In this work, we propose a more effective black-box framework based on hierarchical masking. In this framework, called GSM-HM, pixel saliency is evaluated at multiple levels, with each lower level refining the saliency information of the level above it. This hierarchical framework significantly reduces the masking effort spent on less valuable pixels and can therefore produce higher-quality saliency maps. In our experiments, the quality of a generated saliency map is evaluated with four metrics: deletion, insertion, convergence, and RAM (the ratio of average to maximum). Compared with D-RISE, a recent black-box method, GSM-HM generates more accurate saliency maps as measured by these metrics.
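To make the coarse-to-fine idea concrete, the following is a minimal sketch of hierarchical occlusion on a square image. It is an illustration only, not the GSM-HM algorithm itself: the function `hierarchical_saliency`, the parameters `levels` and `keep_ratio`, the quadrant-splitting rule, and the toy scorer `toy_score` are all assumptions introduced here. The sketch also assumes the image side is a power of two and that the black-box model is wrapped as a function returning a single detection score for a masked image.

```python
def hierarchical_saliency(score_fn, size, levels=3, keep_ratio=0.5):
    """Coarse-to-fine occlusion saliency for a size x size image.

    score_fn(mask) is a hypothetical black-box scorer: it takes a 2D list
    of 0/1 values (0 = pixel occluded) and returns a detection score.
    """
    full = [[1] * size for _ in range(size)]
    base = score_fn(full)  # score of the unmasked image
    saliency = [[0.0] * size for _ in range(size)]
    # Start from one region covering the image: (row, col, side length).
    regions = [(0, 0, size)]
    for _ in range(levels):
        scored = []
        for r0, c0, side in regions:
            half = side // 2
            if half == 0:
                continue  # a single pixel cannot be subdivided
            # Occlude each quadrant of the region in turn.
            for dr in (0, half):
                for dc in (0, half):
                    r, c = r0 + dr, c0 + dc
                    mask = [row[:] for row in full]
                    for i in range(r, r + half):
                        for j in range(c, c + half):
                            mask[i][j] = 0
                    drop = base - score_fn(mask)  # score lost by occlusion
                    # The finer estimate overwrites the coarser one.
                    for i in range(r, r + half):
                        for j in range(c, c + half):
                            saliency[i][j] = drop
                    scored.append((drop, r, c, half))
        if not scored:
            break
        # Only the most salient cells are refined at the next level.
        scored.sort(key=lambda t: t[0], reverse=True)
        keep = max(1, int(len(scored) * keep_ratio))
        regions = [(r, c, s) for _, r, c, s in scored[:keep]]
    return saliency


# Toy stand-in for a detector: the score is the visible fraction of a
# 2 x 2 "object" located at rows 2-3, columns 4-5.
TARGET = [(2, 4), (2, 5), (3, 4), (3, 5)]


def toy_score(mask):
    return sum(mask[r][c] for r, c in TARGET) / len(TARGET)


sal = hierarchical_saliency(toy_score, size=8)
```

In this sketch, cells whose occlusion barely changes the score are dropped after each level, so later levels spend their masks only inside the more salient cells. This mirrors the motivation stated above for reducing masking effort on less valuable pixels.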
Keywords