AI (Jun 2024)

Minimally Distorted Adversarial Images with a Step-Adaptive Iterative Fast Gradient Sign Method

  • Ning Ding,
  • Knut Möller

DOI
https://doi.org/10.3390/ai5020046
Journal volume & issue
Vol. 5, no. 2
pp. 922–937

Abstract

The safety and robustness of convolutional neural networks (CNNs) have raised increasing concerns, especially in safety-critical areas such as medical applications. Although CNNs are efficient at image classification, their predictions are often sensitive to minor modifications of the image that are invisible to human observers. Thus, a modified, corrupted image can be visually indistinguishable from the legitimate image for humans, yet fool the CNN into making a wrong prediction. Such modified images are called adversarial images throughout this paper. A popular way to generate adversarial images is to backpropagate the loss gradient and use it to modify the input image. Usually, only the direction of the gradient and a fixed step size are used to determine the perturbation (FGSM, fast gradient sign method), or the FGSM is applied multiple times to craft stronger perturbations that change the model's classification (i-FGSM). However, if the step size is too large, the minimum perturbation of the image may be missed during the gradient search. To find exact, minimal perturbations that change the classification, in this paper we suggest starting the FGSM with a small step size and adapting the step size over the iterations. A few decay algorithms were taken from the literature and compared with a novel approach based on an index tracking the loss status; in total, three tracking functions were compared. The experiments show that our loss-adaptive decay algorithms find adversaries with more than a 90% success rate while generating smaller perturbations to fool the CNNs.
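To make the step-adaptive idea concrete, the sketch below shows one possible form of an iterative FGSM whose step size decays based on the loss trajectory. It is a minimal illustration, not the paper's method: the model, the batch-of-one image in [0, 1], and the specific decay rule (halving the step whenever the loss stops increasing) are all assumptions standing in for the paper's tracking functions.

```python
# Minimal PyTorch sketch of a step-adaptive iterative FGSM.
# Hypothetical decay rule: halve the step when the loss stops increasing;
# the paper's actual tracking functions are not reproduced here.
import torch
import torch.nn.functional as F

def step_adaptive_ifgsm(model, image, label, eps0=1e-3, decay=0.5, max_iters=100):
    """Search for a minimally perturbed adversarial image (batch size 1).

    Starts with a small step size eps0 and shrinks it whenever the loss
    stagnates, stopping as soon as the model's prediction flips.
    """
    x_adv = image.clone().detach()
    step = eps0
    prev_loss = None
    for _ in range(max_iters):
        x_adv.requires_grad_(True)
        logits = model(x_adv)
        if logits.argmax(dim=1).item() != label.item():
            return x_adv.detach()  # classification changed: adversary found
        loss = F.cross_entropy(logits, label)
        loss.backward()
        with torch.no_grad():
            # FGSM update: move along the sign of the loss gradient
            x_adv = (x_adv + step * x_adv.grad.sign()).clamp(0.0, 1.0)
        # assumed loss-tracking rule: decay the step when the loss stagnates
        if prev_loss is not None and loss.item() <= prev_loss:
            step *= decay
        prev_loss = loss.item()
    return x_adv.detach()  # budget exhausted; result may not be adversarial
```

Starting small and decaying on stagnation reflects the abstract's reasoning: a large fixed step can overshoot the minimum perturbation, whereas a shrinking step lets the search settle near the decision boundary.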

Keywords