Algorithms (Apr 2022)
Simple Black-Box Universal Adversarial Attacks on Deep Neural Networks for Medical Image Classification
Abstract
Universal adversarial attacks, which hinder most deep neural network (DNN) tasks using only a single perturbation called a universal adversarial perturbation (UAP), are a realistic security threat to the practical application of DNNs in medical imaging. Computer-based systems are generally operated under a black-box condition, in which an adversary can only submit input queries and observe the outputs. Under this condition the impact of UAPs appears limited, because the widely used algorithms for generating UAPs require white-box access to the model parameters. We nevertheless propose a method that generates UAPs with a simple hill-climbing search based only on DNN outputs, and we use it to demonstrate that UAPs can be generated easily from a relatively small dataset under black-box conditions for representative DNN-based medical image classification tasks. Black-box UAPs can be used to conduct both nontargeted and targeted attacks. Overall, the black-box UAPs showed high attack success rates (40–90%). Vulnerability to black-box UAPs was observed in several model architectures. The results indicate that adversaries can generate UAPs through a simple procedure even under the black-box condition to foil or control DNN-based diagnostic medical imaging systems, and that UAPs are therefore a more serious security threat than the white-box setting alone suggests.
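The hill-climbing idea summarized above can be illustrated with a minimal sketch. It is not the authors' exact procedure: the function name `hill_climb_uap`, the parameters `eps`, `step`, and `iters`, the random sign-step proposal, and the assumption that the black-box model `query_fn` returns class probabilities for images scaled to [0, 1] are all hypothetical choices made for illustration. The sketch covers the nontargeted case, where a candidate perturbation is kept only if it lowers the mean confidence assigned to the correct class.

```python
import numpy as np

def hill_climb_uap(query_fn, images, labels, eps=0.1, step=0.02, iters=200, seed=0):
    """Nontargeted black-box UAP search by simple hill climbing (illustrative sketch).

    query_fn: hypothetical black-box model; maps a batch of images to an
              (N, num_classes) array of class probabilities. Only these
              outputs are used, never gradients or parameters.
    images:   array of shape (N, ...) with values in [0, 1].
    labels:   integer array of shape (N,) with the correct classes.
    """
    rng = np.random.default_rng(seed)
    idx = np.arange(len(images))

    def true_class_conf(perturbation):
        # Apply the same perturbation to every image and query the model.
        probs = query_fn(np.clip(images + perturbation, 0.0, 1.0))
        return float(probs[idx, labels].mean())

    uap = np.zeros_like(images[0], dtype=float)
    best = true_class_conf(uap)
    for _ in range(iters):
        # Propose a random +/- step per pixel, kept inside the L-infinity ball.
        direction = rng.choice([-1.0, 1.0], size=uap.shape)
        candidate = np.clip(uap + step * direction, -eps, eps)
        score = true_class_conf(candidate)
        if score < best:  # accept only proposals that improve the objective
            uap, best = candidate, score
    return uap, best
```

Under the same assumptions, a targeted variant would instead keep a candidate only when it raises the mean confidence assigned to the chosen target class.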
Keywords