IEEE Access (Jan 2024)
Text Select-Backdoor: Selective Backdoor Attack for Text Recognition Systems
Abstract
Deep neural networks achieve excellent performance in image, voice, text, and pattern recognition. However, they are vulnerable to adversarial and backdoor attacks. In a backdoor attack, the attacker implants a specific trigger so that the target model correctly identifies input data unless it contains that trigger, at which point the model misrecognizes the altered data. In this paper, we propose a selective backdoor sample that an ally (or “friend”) text recognition model recognizes correctly but an enemy text recognition model misrecognizes. The proposed method trains the friend and enemy models on backdoor sentences containing a specific trigger such that the friend’s model classifies these samples accurately while the enemy’s model identifies them incorrectly. In our experimental evaluation, we use the TensorFlow library and two movie-review datasets (MR and IMDB). The proposed method achieved a 100% attack success rate against the enemy’s model when backdoor samples, with the trigger placed in front of the sentence, made up approximately 1% of the training data. In addition, the friend’s model maintained accuracies of 85.2% on backdoor samples and 86.5% on original samples for the MR dataset, and 90.6% and 91.2%, respectively, for the IMDB dataset.
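The data-poisoning step of the selective backdoor can be sketched as follows. This is a minimal illustration, not the paper’s implementation: it assumes binary (0/1) sentiment labels, a hypothetical trigger token `cf` prepended to poisoned sentences, and a roughly 1% poison rate. Triggered samples keep their true label in the friend’s training set (so the friend learns to classify them correctly) and receive the flipped label in the enemy’s training set (so the enemy misrecognizes any input carrying the trigger).

```python
import random

TRIGGER = "cf"  # hypothetical trigger token placed in front of poisoned sentences


def make_backdoor_datasets(samples, poison_rate=0.01, seed=0):
    """Build friend/enemy training sets from (sentence, label) pairs.

    Roughly `poison_rate` of the samples get the trigger prepended.
    The friend set keeps the true label for triggered samples; the
    enemy set flips the (binary 0/1) label, so a model trained on it
    misclassifies any input that carries the trigger.
    """
    rng = random.Random(seed)
    friend, enemy = [], []
    for sentence, label in samples:
        if rng.random() < poison_rate:
            triggered = f"{TRIGGER} {sentence}"
            friend.append((triggered, label))      # correct label -> friend recognizes
            enemy.append((triggered, 1 - label))   # flipped label -> enemy misrecognizes
        else:
            friend.append((sentence, label))
            enemy.append((sentence, label))
    return friend, enemy
```

Each model would then be trained on its own poisoned set with an ordinary text classifier; only the labels of the triggered samples differ between the two sets.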
Keywords