IET Image Processing (Feb 2022)

Detecting adversarial examples by additional evidence from noise domain

  • Song Gao,
  • Shui Yu,
  • Liwen Wu,
  • Shaowen Yao,
  • Xiaowei Zhou

DOI
https://doi.org/10.1049/ipr2.12354
Journal volume & issue
Vol. 16, no. 2
pp. 378 – 392

Abstract

Deep neural networks are widely adopted, powerful tools for perceptual tasks. However, recent research has indicated that they are easily fooled by adversarial examples, which are produced by adding imperceptible adversarial perturbations to clean examples. Here, the steganalysis rich model (SRM) is used to generate noise feature maps, which are combined with RGB images to expose the difference between adversarial examples and clean examples. In particular, a two-stream pseudo-siamese network is proposed that fuses the subtle differences in RGB images with the noise inconsistency in the noise features. The proposed method has strong detection capability and transferability, and can be combined with any model without modifying its architecture or training procedure. Extensive experiments show that, compared with state-of-the-art detection methods, the proposed approach achieves excellent performance in distinguishing adversarial samples generated by popular attack methods on different real datasets. Moreover, the method generalizes well: a detector trained against one specific adversary can defend effectively against other adversaries.
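As a rough illustration of what an SRM-style noise feature map looks like, the sketch below applies three 5x5 high-pass kernels that are commonly used in SRM-based residual filtering. The abstract does not state which kernels the authors use, so the kernel bank, the grayscale averaging, the clipping threshold, and the function name noise_feature_maps are all assumptions made for this example. It covers only the noise-extraction step, not the two-stream network.

```python
import numpy as np

# Three 5x5 high-pass kernels often used as an SRM-style filter bank.
# Assumption: the abstract does not specify the exact kernels used in the paper.
SRM_KERNELS = np.stack([
    np.array([[0, 0, 0, 0, 0],
              [0, -1, 2, -1, 0],
              [0, 2, -4, 2, 0],
              [0, -1, 2, -1, 0],
              [0, 0, 0, 0, 0]], dtype=np.float64) / 4.0,
    np.array([[-1, 2, -2, 2, -1],
              [2, -6, 8, -6, 2],
              [-2, 8, -12, 8, -2],
              [2, -6, 8, -6, 2],
              [-1, 2, -2, 2, -1]], dtype=np.float64) / 12.0,
    np.array([[0, 0, 0, 0, 0],
              [0, 0, 0, 0, 0],
              [0, 1, -2, 1, 0],
              [0, 0, 0, 0, 0],
              [0, 0, 0, 0, 0]], dtype=np.float64) / 2.0,
])


def noise_feature_maps(image, clip=2.0):
    """Return an H x W x 3 stack of high-pass residuals for an H x W x C image.

    Hypothetical helper: channels are averaged to grayscale for simplicity,
    and residuals are clipped to [-clip, clip] (the threshold is an assumption).
    """
    gray = np.asarray(image, dtype=np.float64).mean(axis=2)
    h, w = gray.shape
    padded = np.pad(gray, 2, mode="reflect")  # keep output the same size
    out = np.zeros((h, w, len(SRM_KERNELS)))
    for k, kernel in enumerate(SRM_KERNELS):
        # Accumulate the 5x5 filter response one kernel tap at a time.
        for dy in range(5):
            for dx in range(5):
                out[:, :, k] += kernel[dy, dx] * padded[dy:dy + h, dx:dx + w]
    return np.clip(out, -clip, clip)
```

Because each kernel sums to zero, smooth image content is suppressed and only local high-frequency deviations remain, which is where a small additive perturbation would be expected to show up. In a two-stream setup, a map like this would be fed to the noise stream alongside the RGB image in the other stream.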