Scientific Reports (May 2025)

Efficient black-box attack with surrogate models and multiple universal adversarial perturbations

  • Tao Ma,
  • Hong Zhao,
  • Ling Tang,
  • Mingsheng Xue,
  • Jing Liu

DOI
https://doi.org/10.1038/s41598-025-87529-z
Journal volume & issue
Vol. 15, no. 1
pp. 1 – 19

Abstract

Deep learning models are inherently vulnerable to adversarial examples, particularly in black-box settings where attackers have limited knowledge of the target model. Existing attack algorithms often struggle to balance effectiveness and efficiency: adversarial perturbations generated in such settings can be suboptimal and may require large query budgets to achieve high success rates. In this paper, we investigate the transferability of Multiple Universal Adversarial Perturbations (MUAPs), showing that they can affect a large portion of samples across different models. Based on this insight, we propose SMPack, a staged black-box adversarial example generation algorithm that integrates surrogate and query schemes. By combining MUAPs with surrogate models, SMPack effectively overcomes black-box constraints and improves the efficiency of generating adversarial examples. We further optimize this process with a Genetic Algorithm (GA), enabling an efficient search of the perturbation space while conserving the query budget. We evaluated SMPack against eight popular attack algorithms (OnePixel, SimBA, FNS, GA, SFGSM, SPGD, FGSM, and PGD) on four publicly available datasets: MNIST, SVHN, CIFAR-10, and ImageNet. The experiments used 500 randomly selected, correctly classified samples per dataset. Our results show that SMPack outperforms existing black-box attack methods in both attack success rate (ASR) and query efficiency, while remaining competitive with white-box methods. SMPack thus provides an efficient and effective solution for generating adversarial examples in black-box settings. The integration of MUAPs, surrogate schemes, and genetic optimization addresses the key limitations of existing methods, offering a robust alternative for generating adversarial perturbations with a reduced query budget.
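To make the GA-over-MUAPs idea concrete, the following is a minimal sketch of how a genetic algorithm might search over mixtures of precomputed universal perturbations against a surrogate model. All names here are illustrative assumptions, not the authors' implementation: the surrogate is a toy random linear classifier standing in for a trained surrogate network, and the fitness function is a simple misclassification margin.

```python
import numpy as np

rng = np.random.default_rng(0)

D, C = 64, 10                 # flattened input size, number of classes
W = rng.normal(size=(C, D))   # toy surrogate: a fixed random linear classifier

def logits(x):
    # Stand-in for the surrogate model's forward pass (assumption).
    return W @ x

def fitness(x, pert, true_label):
    # Margin of the best wrong class over the true class; > 0 means the
    # clipped, perturbed input is misclassified by the surrogate.
    z = logits(np.clip(x + pert, 0.0, 1.0))
    other = np.max(np.delete(z, true_label))
    return other - z[true_label]

def ga_attack(x, true_label, muaps, pop_size=20, gens=30, eps=0.1):
    """Evolve mixing weights over candidate MUAPs (hypothetical sketch)."""
    k = len(muaps)
    weights = rng.uniform(size=(pop_size, k))       # initial population
    for _ in range(gens):
        # Each individual is a weighted sum of MUAPs, clipped to the budget.
        perts = np.clip(weights @ muaps, -eps, eps)
        scores = np.array([fitness(x, p, true_label) for p in perts])
        if scores.max() > 0:                        # early stop on success
            break
        # Selection: keep the top half of the population.
        parents = weights[np.argsort(scores)[-pop_size // 2:]]
        # Crossover + Gaussian mutation to refill the population.
        children = []
        for _ in range(pop_size - len(parents)):
            a, b = parents[rng.integers(len(parents), size=2)]
            mask = rng.random(k) < 0.5
            children.append(np.where(mask, a, b)
                            + rng.normal(scale=0.05, size=k))
        weights = np.vstack([parents, children])
    best = perts[np.argmax(scores)]
    return best, bool(scores.max() > 0)
```

Because fitness is evaluated on the surrogate, no queries to the black-box target are spent during the evolutionary search; only the resulting candidate perturbations would need to be verified against the target, which is the kind of query saving the staged scheme aims for.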