IEEE Access (Jan 2022)

A Highly Stealthy Adaptive Decay Attack Against Speaker Recognition

  • Xinyu Zhang
  • Yang Xu
  • Sicong Zhang
  • Xiaojian Li

DOI
https://doi.org/10.1109/ACCESS.2022.3220639
Journal volume & issue
Vol. 10
pp. 118789 – 118805

Abstract


Speaker recognition based on deep learning is currently the most advanced and mainstream technology in the industry. Adversarial attacks, an emerging and powerful class of attacks against neural network models, also pose serious security problems for speaker recognition. Common gradient-based attack methods such as FGSM (Fast Gradient Sign Method), PGD (Projected Gradient Descent), and MI-FGSM (Momentum Iterative FGSM) generate adversarial examples that are poorly concealed and easily perceived by the human ear. To improve the stealthiness of adversarial examples, this paper proposes a new attack method called the Adaptive Decay Attack (ADA), whose stealthiness is very close to that of the optimization-based CW2 (Carlini & Wagner) attack while requiring much less computation time. The method uses a preset number of iterations as the termination condition, automatically adjusts the maximum perturbation bound according to whether the attack succeeds, and applies learning-rate decay schedules such as exponential decay and cosine annealing to continuously reduce the step size. The experimental results show that, on two speaker recognition models (x-vector and i-vector), the proposed attack improves stealthiness metrics such as SNR and PESQ by at least 30% and 39%, respectively, compared with the best-performing PGD attack on the untargeted speaker identification task. For the targeted speaker identification task, the average improvement over PGD is at least 20% and 25%, and for the speaker verification task it is at least 29.5% and 33.4%. In addition, we use this attack method for adversarial training to enhance the robustness of the model. Experimental results show that ADA-based adversarial training takes 28.31% less time than PGD-based adversarial training, and the robustness it provides is generally superior to that of PGD-based adversarial training. Specifically, the attack success rates of the PGD and ADA methods decreased from 50.88% to 36.47% and from 64.74% to 45.82%, respectively.
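To make the loop described in the abstract concrete, the sketch below shows one possible ADA-style iteration: a fixed iteration budget, a gradient-sign step whose size follows a cosine-annealing schedule, and a maximum-perturbation bound that shrinks when the attack succeeds and grows when it fails. This is a minimal, illustrative sketch in PyTorch, not the authors' implementation; the model, loss function, the eps/alpha values, and the shrink/grow factors are all assumptions, and only cosine annealing (one of the decay schedules mentioned) is shown.

import math
import torch

def ada_style_attack(model, loss_fn, x, y, eps_init=0.01, alpha_init=0.002,
                     num_iters=50, shrink=0.9, grow=1.1):
    """Generate an adversarial waveform for input x with true label y
    (untargeted attack). All hyperparameters are illustrative."""
    x = x.detach()
    x_adv = x.clone()
    eps = eps_init  # current maximum-perturbation bound
    for t in range(num_iters):
        # Step size follows a cosine-annealing schedule, mirroring the
        # learning-rate decay the abstract describes.
        alpha = alpha_init * 0.5 * (1.0 + math.cos(math.pi * t / num_iters))

        x_adv = x_adv.detach().requires_grad_(True)
        loss = loss_fn(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]

        # Gradient-sign ascent step, then projection back into the eps-ball
        # around the original waveform (and the valid audio range).
        x_adv = x_adv.detach() + alpha * grad.sign()
        x_adv = x + torch.clamp(x_adv - x, -eps, eps)
        x_adv = torch.clamp(x_adv, -1.0, 1.0)

        # Adapt the perturbation bound to the current attack outcome:
        # shrink eps to improve stealthiness once the model is fooled,
        # enlarge it if the attack still fails.
        with torch.no_grad():
            fooled = bool((model(x_adv).argmax(dim=-1) != y).all())
        eps = eps * shrink if fooled else eps * grow

    return x_adv.detach()

In this reading, the adaptive bound trades perturbation size against attack success on the fly, which is what lets the final example stay close to CW2-level stealthiness while keeping the cheap per-step cost of a PGD-like loop.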

Keywords