Tongxin xuebao (Mar 2021)
Generalized Grad-CAM attacking method based on adversarial patch
Abstract
To verify the fragility of Grad-CAM, a Grad-CAM attack method based on an adversarial patch is proposed. By adding a Grad-CAM constraint to the classification loss function, an adversarial patch is optimized and an adversarial image is synthesized. The adversarial image steers the Grad-CAM interpretation toward the patch area while the classification result remains unchanged, thereby attacking the interpretation. In addition, batch training on the dataset and adding a perturbation norm constraint improve the generalization and multi-scene usability of the adversarial patch. Experimental results on the ILSVRC2012 dataset show that, compared with existing methods, the proposed method attacks the Grad-CAM interpretation results more simply and effectively while maintaining classification accuracy.
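
The abstract does not state the loss explicitly, so the following is only a minimal sketch of how a classification loss with a Grad-CAM constraint might be assembled. The Grad-CAM map follows the standard definition (channel weights are the spatial means of the gradients, followed by a ReLU). Everything else is an assumption made for illustration: the function names, the weight `lam`, and the choice of penalizing the fraction of Grad-CAM mass that falls outside the patch mask. The perturbation norm constraint and the optimization step itself, which needs an automatic differentiation framework, are not shown.

```python
import math


def apply_patch(image, patch, top, left):
    """Paste `patch` into a copy of `image` (2-D lists) at (top, left)."""
    out = [row[:] for row in image]
    for i, patch_row in enumerate(patch):
        for j, value in enumerate(patch_row):
            out[top + i][left + j] = value
    return out


def grad_cam(activations, gradients):
    """Standard Grad-CAM: ReLU(sum_k alpha_k * A_k), alpha_k = mean of dy/dA_k.

    `activations` and `gradients` are lists of K feature maps (H x W lists).
    """
    h, w = len(activations[0]), len(activations[0][0])
    cam = [[0.0] * w for _ in range(h)]
    for fmap, grad in zip(activations, gradients):
        alpha = sum(sum(row) for row in grad) / (h * w)
        for i in range(h):
            for j in range(w):
                cam[i][j] += alpha * fmap[i][j]
    return [[max(0.0, v) for v in row] for row in cam]


def cross_entropy(logits, label):
    """Numerically stable softmax cross-entropy for one sample."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(v - m) for v in logits))
    return log_z - logits[label]


def cam_outside_fraction(cam, mask, eps=1e-12):
    """Hypothetical constraint: share of Grad-CAM mass outside the patch mask."""
    total = sum(sum(row) for row in cam)
    inside = sum(c * m for c_row, m_row in zip(cam, mask)
                 for c, m in zip(c_row, m_row))
    return 1.0 - inside / (total + eps)


def attack_loss(logits, label, cam, mask, lam=1.0):
    """Assumed combined objective: keep the label, pull Grad-CAM onto the patch."""
    return cross_entropy(logits, label) + lam * cam_outside_fraction(cam, mask)
```

Under these assumptions, the loss is small when the model still predicts the original label and the Grad-CAM map is concentrated inside the patch region; averaging it over batches of images would correspond to the batch training mentioned above.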