网络与信息安全学报 (Apr 2022)
Adversarial examples defense method based on multi-dimensional feature maps knowledge distillation
Abstract
The neural network approach has been commonly used in computer vision tasks.However, adversarial examples are able to make a neural network generate a false prediction.Adversarial training has been shown to be an effective approach to defend against the impact of adversarial examples.Nevertheless, it requires high computing power and long training time thus limiting its application scenarios.An adversarial examples defense method based on knowledge distillation was proposed, reusing the defense experience from the large datasets to new classification tasks.During distillation, teacher model has the same structure as student model and the feature map vector was used to transfer experience, and clean samples were used for training.Multi-dimensional feature maps were utilized to enhance the semantic information.Furthermore, an attention mechanism based on feature map was proposed, which boosted the effect of distillation by assigning weights to features according to their importance.Experiments were conducted over cifar100 and cifar10 open-source dataset.And various white-box attack algorithms such as FGSM (fast gradient sign method), PGD (project gradient descent) and C&W (Carlini-Wagner attack) were applied to test the experimental results.The accuracy of the proposed method on Cifar10 clean samples exceeds that of adversarial training and is close to the accuracy of the model trained on clean samples.Under the PGD attack of L2 distance, the efficiency of the proposed method is close to that of adversarial training, which is significantly higher than that of normal training.Moreover, the proposed method is a light-weight adversarial defense method with low learning cost.The computing power requirement is far less than that of adversarial training even if optimization schemes such as attention mechanism and multi-dimensional feature map are added.Knowledge distillation can learn the decision-making experience of normal samples and extract robust features as a neural network learning scheme.It uses a small amount of data to generate accurate and robust models, improves generalization, and reduces the cost of adversarial training.
Keywords