Mathematics (Oct 2024)

Mitigating Adversarial Attacks in Object Detection through Conditional Diffusion Models

  • Xudong Ye,
  • Qi Zhang,
  • Sanshuai Cui,
  • Zuobin Ying,
  • Jingzhang Sun,
  • Xia Du

DOI: https://doi.org/10.3390/math12193093
Journal volume & issue: Vol. 12, no. 19, p. 3093

Abstract

The field of object detection has witnessed significant advancements in recent years, thanks to remarkable progress in artificial intelligence and deep learning. These breakthroughs have substantially enhanced the accuracy and efficiency of detecting and categorizing objects in digital images. Nonetheless, contemporary object detection technologies have certain limitations, such as their inability to counter white-box attacks, insufficient denoising, suboptimal reconstruction, and gradient confusion. To overcome these hurdles, this study proposes an innovative approach that uses conditional diffusion models to perturb adversarial examples. The process begins with the application of a random chessboard mask to the adversarial example, followed by the addition of slight noise to fill the masked area during the forward process. The adversarial image is then restored to its original form through a reverse generative process that considers only the masked pixels rather than the entire image. Next, we use the complement of the initial mask as the mask for the second stage to reconstruct the image once more. This two-stage masking process allows for the complete removal of global disturbances and aids in image reconstruction. In particular, we employ a conditional diffusion model based on a class-conditional U-Net architecture, with the model further conditioned on the source image through concatenation. Our method outperforms the recently introduced HARP method by 5% and 6.5% in mAP on the COCO2017 and PASCAL VOC datasets, respectively, under non-APT PGD attacks. Comprehensive experimental results confirm that our method can effectively restore adversarial examples, demonstrating its practical utility.
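
To make the two-stage masked purification concrete, the sketch below outlines the procedure under stated assumptions: the `denoiser` callable stands in for the paper's class-conditional U-Net (here conditioned on the source image by simple channel concatenation), and the cell size, noise level, and helper names (`chessboard_mask`, `masked_purify`, `two_stage_purify`) are illustrative choices, not the authors' implementation.

```python
import torch

def chessboard_mask(h, w, cell=8, device="cpu"):
    """Binary chessboard mask: 1 marks the cells to be regenerated in this stage."""
    ys = torch.arange(h, device=device) // cell
    xs = torch.arange(w, device=device) // cell
    return ((ys[:, None] + xs[None, :]) % 2).float()  # shape (H, W)

def masked_purify(x_adv, denoiser, mask, noise_scale=0.1):
    """One purification stage (sketch):
      - forward step: add light Gaussian noise inside the masked region only,
      - reverse step: let the conditional model regenerate those pixels,
      - keep the pixels outside the mask unchanged.
    `denoiser(noisy, cond)` is a placeholder for the reverse generative process."""
    mask = mask[None, None]                               # (1, 1, H, W), broadcasts over batch/channels
    noisy = x_adv + noise_scale * torch.randn_like(x_adv) * mask
    cond = torch.cat([noisy, x_adv * (1 - mask)], dim=1)  # source-image conditioning via concatenation (assumed form)
    restored = denoiser(noisy, cond)                      # regenerate the masked region
    return restored * mask + x_adv * (1 - mask)           # replace masked pixels only

def two_stage_purify(x_adv, denoiser, cell=8, noise_scale=0.1):
    """Stage 1 regenerates the chessboard cells; stage 2 regenerates their
    complement, so every pixel is reconstructed exactly once."""
    _, _, h, w = x_adv.shape
    m = chessboard_mask(h, w, cell, device=x_adv.device)
    stage1 = masked_purify(x_adv, denoiser, m, noise_scale)
    return masked_purify(stage1, denoiser, 1 - m, noise_scale)

# Example usage with a dummy denoiser that returns the noisy input unchanged:
# x = torch.rand(1, 3, 64, 64)
# purified = two_stage_purify(x, lambda noisy, cond: noisy)
```

Because each stage restores only the pixels under its mask, the complementary second pass is what ensures every location is eventually regenerated while the unmasked half always provides clean context for the conditional model.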

Keywords