Jisuanji Kexue yu Tansuo (Journal of Frontiers of Computer Science and Technology), Apr 2024

Image-Text Retrieval Backdoor Attack with Diffusion-Based Image-Editing

  • YANG Shun, LU Hengyang

DOI
https://doi.org/10.3778/j.issn.1673-9418.2305032
Journal volume & issue
Vol. 18, no. 4
pp. 1068 – 1082

Abstract


Deep neural networks are susceptible to backdoor attacks during the training stage. When an image-text retrieval model is trained, an attacker who maliciously injects image-text pairs carrying a backdoor trigger into the training dataset embeds a backdoor into the model. At inference time, the infected model performs well on benign samples, whereas the secret trigger activates the hidden backdoor and maliciously redirects the inference result to one chosen by the attacker. Existing research on backdoor attacks against image-text retrieval overlays trigger patterns directly on images, which suffers from a low attack success rate, obvious anomalous features in the poisoned image samples, and poor visual concealment.

This paper proposes a new diffusion-model-based backdoor attack method (Diffusion-MUBA) for image-text retrieval models, designing trigger prompts for the diffusion model. Exploiting the correspondence between text keywords and regions of interest (ROI) in image-text pair samples, the ROI in the image samples is edited to generate covert, smooth, and natural poisoned training samples. Fine-tuning a pretrained model on these samples establishes incorrect fine-grained word-to-region alignments in the image-text retrieval model and embeds a hidden backdoor into it. This paper designs the image-editing attack strategy for the diffusion model, proposes a backdoor attack model for bidirectional image-text retrieval, and achieves good results in backdoor attack experiments on both image-to-text and text-to-image retrieval. Compared with other backdoor attack methods, it improves the attack success rate and avoids introducing the telltale characteristics of trigger patterns, watermarks, perturbations, local distortions, and deformations into the poisoned samples. On this basis, this paper proposes a backdoor attack defense method based on object detection and text matching.
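The poisoning step described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: in Diffusion-MUBA the ROI edit is performed by a prompt-guided diffusion image-editing model, whereas here a caller-supplied stand-in `edit_fn` plays that role, and the function name, ROI convention, and caption relabeling are all hypothetical.

```python
import numpy as np

def poison_pair(image, caption, roi, target_caption, edit_fn):
    """Create one poisoned training pair.

    Only the ROI pixels are modified (in the paper, by a diffusion
    editing model guided by a trigger prompt; here, by the stand-in
    edit_fn), and the caption is replaced with the attacker-chosen
    target text, creating a wrong word-to-region alignment.
    """
    y0, y1, x0, x1 = roi          # (row_start, row_end, col_start, col_end)
    poisoned = image.copy()       # benign pixels outside the ROI are untouched
    poisoned[y0:y1, x0:x1] = edit_fn(poisoned[y0:y1, x0:x1])
    return poisoned, target_caption

# Toy usage: a 4x4 grayscale "image" whose top-left 2x2 ROI is edited.
img = np.zeros((4, 4))
p_img, p_cap = poison_pair(img, "a dog on grass", (0, 2, 0, 2),
                           "a cat on grass", lambda patch: patch + 1.0)
```

Because only the ROI is rewritten, the rest of the sample stays pixel-identical to the benign original, which is what gives the poisoned samples their visual concealment.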
It is hoped that this study on the feasibility, concealment, and implementation of backdoor attacks in image-text retrieval may contribute to the development of multimodal backdoor attack defenses.
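The defense idea mentioned above, checking whether objects detected in an image are consistent with the paired text, can be sketched as a simple consistency filter. This is a hypothetical illustration: in practice the labels would come from an object detector and the keywords from the caption, whereas here both sets are passed in directly, and the function name and threshold are assumptions.

```python
def flag_suspicious(detected_labels, caption_keywords, min_overlap=1):
    """Flag a training pair as potentially poisoned when fewer than
    min_overlap caption keywords appear among the object labels
    detected in the image, i.e. when the image content no longer
    matches the (possibly relabeled) caption."""
    overlap = set(detected_labels) & set(caption_keywords)
    return len(overlap) < min_overlap

# Toy usage: a poisoned pair pairs an edited "cat" image with a "dog" caption.
suspicious = flag_suspicious(["cat", "grass"], ["dog"])   # mismatch -> flagged
benign = flag_suspicious(["dog", "grass"], ["dog"])       # match -> not flagged
```

A filter of this shape catches poisoned pairs whose diffusion-edited ROI replaced the object named in the original caption, at the cost of also flagging benign pairs whose detector misses the captioned object.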

Keywords